Abstract

This paper adds counterfactuals to the framework of knowledge-based programs 
of Fagin, Halpern, Moses, and Vardi [1995, 1997]. The use of counterfactuals is 
illustrated by designing a protocol in which an agent stops sending messages once 
it knows that it is safe to do so. Such behavior is difficult to capture in the original 
framework because it involves reasoning about counterfactual executions, including 
ones that are not consistent with the protocol. Attempts to formalize these notions 
without counterfactuals are shown to lead to rather counterintuitive behavior. 
1 Introduction



Knowledge-based programs, first introduced by Halpern and Fagin [1989] and further 
developed by Fagin, Halpern, Moses, and Vardi [1995, 1997], are intended to provide a 
high-level framework for the design and specification of protocols. The idea is that, in 
knowledge-based programs, there are explicit tests for knowledge. Thus, a knowledge-based
program might have the form

if K(x = 0) then y := y + 1 else skip,

where K(x = 0) should be read as "you know x = 0" and skip is the action of doing
nothing. We can informally view this knowledge-based program as saying "if you know
that x = 0, then set y to y + 1 (otherwise do nothing)".

Knowledge-based programs are an attempt to capture the intuition that what an 
agent does depends on what it knows. They have been used successfully in papers such as 
[Dwork and Moses 1990; Hadzilacos 1987; Halpern, Moses, and Waarts 2001; Halpern and 
Zuck 1992; Mazer and Lochovsky 1990; Mazer 1990; Moses and Tuttle 1988; Neiger and 
Toueg 1993] both to help in the design of new protocols and to clarify the understanding
of existing protocols. However, as we show here, there are cases when, used naively, 
knowledge-based programs exhibit some quite counterintuitive behavior. We then show 
how this can be overcome by the use of counterfactuals [Lewis 1973; Stalnaker 1968]. In 
this introduction, we discuss these issues informally, leaving the formal details to later 
sections of the paper. 

Some counterintuitive aspects of knowledge-based programs can be understood by 
considering the bit-transmission problem from [Fagin, Halpern, Moses, and Vardi 1995]. 
In this problem, there are two processes, a sender S and a receiver R, that communicate 
over a communication line. The sender starts with one bit (either 0 or 1) that it wants to
communicate to the receiver. The communication line may be faulty and lose messages 
in either direction in any given round. That is, there is no guarantee that a message 
sent by either S or R will be received. Because of the uncertainty regarding possible 
message loss, S sends the bit to R in every round, until S receives an ack message 
from R acknowledging receipt of the bit. R starts sending the ack message in the round 
after it receives the bit, and continues to send it repeatedly from then on. The sender S 
can be viewed as running the program BT_S:

if recack then skip else sendbit, 

where recack is a proposition that is true if S has already received an ack message from R
and false otherwise, while sendbit is the action of sending the bit.^ Note that BT_S is a
standard program; it does not have tests for knowledge. We can capture some of the
intuitions behind this program by using knowledge. The sender S keeps sending the bit
until an acknowledgment is received from the receiver R. Thus, another way to describe
the sender's behavior is to say that S keeps sending the bit until it knows that the bit was
received by R. This behavior can be characterized by the knowledge-based program BT'_S:

if K_S(recbit) then skip else sendbit,

where recbit is a proposition that is true once R has received the bit. The advantage of
this program over the standard program BT_S is that it abstracts away the mechanism
by which S learns that the bit was received by R. For example, if messages from S to R
are guaranteed to be delivered in the same round in which they are sent, then S knows
that R received the bit even if S does not receive an acknowledgment.

^Running such a program amounts to performing the statement repeatedly forever.

We might hope to improve this even further. Consider a system where all messages 
sent are guaranteed to be delivered, but rather than arriving in one round, they spend 
exactly five rounds in transit. In such a system, a sender using BT_S will send the bit
10 times, because it will take 10 rounds to get the receiver's acknowledgment after the
original message is sent. The program BT'_S is somewhat better; using it, S sends the
bit only five times, since after the fifth round, S will know that R got its first message.
Nevertheless, this seems wasteful. Given that messages are guaranteed to be delivered, it 
clearly suffices for the sender to send the bit once. Intuitively, the sender should be able 
to stop sending the message as soon as it knows that the receiver will eventually receive 
a copy of the message; the sender should not have to wait until the receiver actually 
receives it. 
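
The message counts in this example can be checked with a small simulation. The following Python sketch is only an illustration under stated assumptions: the function name count_sends and the two stop tests are hypothetical, a message sent in round k is assumed to arrive at time k + delay - 1 (it spends exactly `delay` rounds in transit), and K_S(recbit) is assumed to hold exactly when recbit holds, since S knows the fixed delay.

```python
def count_sends(delay, stop_test, horizon=1000):
    """Count how often S sends the bit before its stop test first succeeds.

    Time t is the start of round t + 1. All names here are illustrative
    assumptions, not notation from the paper.
    """
    first_send_round = None
    sends = 0
    for t in range(horizon):
        # Time at which R first holds the bit, and time at which S first holds
        # the ack (R sends its first ack in the round after it gets the bit).
        recbit_time = None if first_send_round is None else first_send_round + delay - 1
        recack_time = None if recbit_time is None else recbit_time + delay
        recbit = recbit_time is not None and t >= recbit_time
        recack = recack_time is not None and t >= recack_time
        if stop_test(recbit, recack):
            return sends          # both tests are stable, so S skips from now on
        sends += 1                # S performs sendbit in round t + 1
        if first_send_round is None:
            first_send_round = t + 1
    return sends


def stop_on_ack(recbit, recack):
    """Standard program: 'if recack then skip else sendbit'."""
    return recack


def stop_on_knowledge(recbit, recack):
    """Knowledge-based program with test K_S(recbit), assumed equal to recbit here."""
    return recbit


print(count_sends(5, stop_on_ack))        # 10
print(count_sends(5, stop_on_knowledge))  # 5
```

Under these assumptions the simulation reproduces the counts of 10 and five sends; the single send that intuitively suffices is not achieved by either test.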

It seems that there should be no problem handling this using knowledge-based programs.
Let ◇ be the standard "eventually" operator from temporal logic [Manna and
Pnueli 1992], so that ◇φ means that φ is eventually true, and let □ be its dual, "always". Now
the following knowledge-based program BT5 for the sender should capture exactly what 
is required: 

if K_S(◇recbit) then skip else sendbit.

Unfortunately, BT^ does not capture our intuitions here. To understand why, consider 
the sender S. Should it send the bit in the first round? According to BT^, the sender 
S should send the bit if S does not know that R will eventually receive the bit. But 
if S sends the bit, then S knows that R will eventually receive it (since messages are 
guaranteed to be delivered in 5 rounds). Thus, S should not send the bit. Similar 
arguments show that S should not send the bit at any round. On the other hand, if 
S never sends the bit, then R will never receive it and thus S should send the bit! It 
follows that according to BT^, S should send the bit exactly if it will never send the bit. 
Obviously, there is no way S can follow such a program. Put another way, this program 
cannot be implemented by a standard program at all. This is certainly not the behavior 
we would intuitively have expected of BT^.^ 

^While intuitions may, of course, vary, some evidence of the counterintuitive behavior of this program 
is that it was used in a draft of [Fagin, Halpern, Moses, and Vardi 1995]; it was several months before 
we realized its problematic nature. 
One approach to dealing with this problem is to change the semantics of knowledge-based
programs. Inherent in the semantics of knowledge-based programs is the fact that
an agent knows what standard protocol she is following. Thus, if the sender is guaranteed 
to send a message in round two, then she knows at time one that the message will be 
sent in the following round. Moreover, if communication is reliable, she also knows the 
message will later be received. If we weaken the semantics of knowledge sufficiently, 
then this problem disappears. (See [Engelhardt, van der Meyden, and Moses 1998] for an 
approach to dealing with the problem addressed in this paper along these lines.) However, 
it is not yet clear how to make this change and still maintain the attractive features of 
knowledge-based programs that we discussed earlier. 

In this paper we consider another approach to dealing with the problem, based on
counterfactuals. Our claim is that the program BT^ does not adequately capture our
intuitions. Rather than saying that S should stop sending if S knows that R will eventually
receive the bit, we should instead say that S should stop sending if it knows that,
even if S does not send another message, R will eventually receive the bit.

How should we capture this? Let do(i, a) be the formula that is true at a point (r, m)
if process i performs action a in the next round.^ The most obvious way to capture "(even)
if S does not send a message then R will eventually receive the bit" uses standard
implication, also known as material implication or the material conditional in philosophical
logic: do(S, skip) ⇒ ◇recbit. This leads to a program such as BT^:

if K_S(do(S, skip) ⇒ ◇recbit) then skip else sendbit.

Unfortunately, this program does not solve our problems. It, too, is not implementable
by a standard program. To see why, suppose that there is some point in the execution
of this protocol where S sends a message. At this point S knows it is sending a message,
so S knows that do(S, skip) is false. Thus, S knows that do(S, skip) ⇒ ◇recbit holds.
As a result, K_S(do(S, skip) ⇒ ◇recbit) is true, so that the test in BT^ succeeds. Thus,
according to BT^, the sender S should not send a message at this point. On the other
hand, if S never sends a message according to the protocol (under any circumstance),
then S knows that it will never send a message (since, after all, S knows how the protocol
works). But in this case, S knows that the receiver will never receive the bit, so the test
fails. Thus, according to BT^, the sender S should send the message as its first action,
this time contradicting the assumption that the message is never sent. Nothing that S
can do is consistent with this program.
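
Both non-implementability arguments (for the test K_S(◇recbit) and for the test with material implication) can be replayed by brute force in a toy setting. The Python sketch below is an illustration under explicit assumptions, not a formalization from the paper: delivery is guaranteed, the horizon is finite, the sender's behavior depends only on the time, and S knows its own protocol, so each knowledge test reduces to a fact about the candidate send schedule. The names `knows_eventually`, `knows_material`, and `implementations` are hypothetical.

```python
from itertools import product


def knows_eventually(schedule, t):
    """Test K_S(<>recbit): with guaranteed delivery (assumed), R eventually
    gets the bit exactly if S sends in at least one round."""
    return any(schedule)


def knows_material(schedule, t):
    """Test K_S(do(S, skip) => <>recbit) with material implication.

    If S sends in the next round, the antecedent is false and the test holds.
    """
    skips_next = not schedule[t]
    return (not skips_next) or any(schedule)


def implementations(test, horizon):
    """Return every send schedule that agrees with
    'if test then skip else sendbit' at every time t < horizon."""
    consistent = []
    for schedule in product([False, True], repeat=horizon):
        prescribed = tuple(not test(schedule, t) for t in range(horizon))
        if prescribed == schedule:
            consistent.append(schedule)
    return consistent


print(implementations(knows_eventually, 4))  # []
print(implementations(knows_material, 4))    # []
```

In this toy model no schedule is a fixed point of either test, which matches the conclusion that nothing S can do is consistent with these programs.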

The problem here is the use of material implication (⇒). Our intuitions are better
captured by using counterfactual implication, which we denote by >. A statement such
as φ > ψ is read "if φ then ψ", just like φ ⇒ ψ. However, the semantics of > is very
different from that of ⇒. The idea, which goes back to Stalnaker [1968] and Lewis [1973],
is that a statement such as φ > ψ is true at a world w if in the worlds "closest to" or
"most like" w where φ is true, ψ is also true. This attempts to capture the intuition that
the counterfactual statement φ > ψ stands for "if φ were the case, then ψ would hold".
For example, suppose that we have a wet match and we make a statement such as "if the
match were dry then it would light". Using ⇒ this statement is trivially true, since the
antecedent is false. However, with >, the situation is not so obvious. We must consider
the worlds most like the actual world where the match is in fact dry and decide whether
it would light in those worlds. If we think the match is defective for some reason, then
even if it were dry, it would not light.

^We assume that round m takes place between time m − 1 and m. Thus, the next round after (r, m)
is round m + 1, which takes place between (r, m) and (r, m + 1).
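
The contrast between material and counterfactual implication in the match example can be made concrete. The Python sketch below is a toy model under assumptions of our own: worlds are described by two independent facts (`dry`, `defective`), a match lights exactly if it is dry and not defective, and closeness is measured by the number of independent facts that differ. This distance is a hypothetical choice made for illustration; it is not a definition of "closest worlds" taken from the paper.

```python
from itertools import product


def make_world(dry, defective):
    # Assumed law of this toy model: a match lights iff it is dry and not defective.
    return {"dry": dry, "defective": defective, "lights": dry and not defective}


WORLDS = [make_world(d, f) for d, f in product([False, True], repeat=2)]


def distance(w, v):
    """Hypothetical closeness measure: number of independent facts that differ."""
    return sum(w[k] != v[k] for k in ("dry", "defective"))


def material(w, phi, psi):
    """phi => psi evaluated at world w."""
    return (not phi(w)) or psi(w)


def counterfactual(w, phi, psi):
    """phi > psi at w: psi holds in all phi-worlds closest to w."""
    candidates = [v for v in WORLDS if phi(v)]
    if not candidates:
        return True  # vacuously true if phi holds nowhere
    best = min(distance(w, v) for v in candidates)
    return all(psi(v) for v in candidates if distance(w, v) == best)


def dry(w):
    return w["dry"]


def lights(w):
    return w["lights"]


wet_defective = make_world(dry=False, defective=True)
print(material(wet_defective, dry, lights))        # True
print(counterfactual(wet_defective, dry, lights))  # False
```

For a wet but sound match, `counterfactual(make_world(False, False), dry, lights)` evaluates to True in this model, so the counterfactual is sensitive to the defect in a way that material implication is not.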

A central issue in the application of counterfactual reasoning to a concrete problem is 
that we need to specify what the "closest worlds" are. The philosophical literature does 
not give us any guidance on this point. We present some general approaches for doing 
so, motivated by our interest in modeling counterfactual reasoning about what would 
happen if an agent were to deviate from the protocol it is following. We believe that this 
example can inform similar applications of counterfactual reasoning in other contexts. 

There is a subtle technical point that needs to be addressed in order to use counterfactuals
in knowledge-based programs. Traditionally, we talk about a knowledge-based
program Pg_kb being implemented by a protocol P. This is the case when the behavior
prescribed by P is in accordance with what Pg_kb specifies. To determine whether P
implements Pg_kb, the knowledge tests (tests for the truth of formulas of the form K_i φ)
in Pg_kb are evaluated with respect to the points appearing in the set of runs of P. In
this system, all the agents know that the properties of P (e.g., facts like process 1 always
sending an acknowledgment after receiving a message from process 2) hold in all
runs. But this set of runs does not account for what may happen if (counter to fact)
some agents were to deviate from P. In counterfactual reasoning, we need to evaluate
formulas with respect to a larger set of runs that allows for such deviations.

We deal with this problem by evaluating counterfactuals with respect to a system
consisting of all possible runs (not just the ones generated by P). While working with
this larger system enables us to reason about counterfactuals, processes no longer know
the properties of P in this system, since it includes many runs not generated by P. In order to
deal with this, we add a notion of likelihood to the system using what are called ranking
functions [Spohn 1988]. Runs generated by P get rank 0; all other runs get higher rank.
(Lower ranks imply greater likelihood.) Ranks let us define a standard notion of belief.
Although a process does not know that the properties of P hold, it believes that they
do. Moreover, when restricted to the set of runs of the original protocol P, this notion of
belief satisfies the knowledge axiom B_i φ ⇒ φ, and coincides with the notion of knowledge
we had in the original system. Thus, when the original protocol is followed, our notion
of belief acts essentially like knowledge.
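
The relationship between knowledge and rank-based belief can be illustrated with a toy model. The Python sketch below is an assumption-laden illustration rather than the formal definition, which is developed later in the paper: points are plain dictionaries with hypothetical fields `local`, `rank`, and `delivered`, and belief is taken to mean truth at all minimal-rank points that share the agent's local state, in the style of ranking functions.

```python
def knows(points, local_state, fact):
    """Knowledge: fact holds at every point with the given local state."""
    candidates = [p for p in points if p["local"] == local_state]
    return all(fact(p) for p in candidates)


def believes(points, local_state, fact):
    """Belief (assumed definition): fact holds at every minimal-rank point
    with the given local state."""
    candidates = [p for p in points if p["local"] == local_state]
    if not candidates:
        return True  # no point is considered possible
    lowest = min(p["rank"] for p in candidates)
    return all(fact(p) for p in candidates if p["rank"] == lowest)


def delivered(point):
    return point["delivered"]


# Hypothetical system: one point from a run of P (rank 0) and one point from
# a run that deviates from P (rank 1); the agent cannot tell them apart.
ALL_POINTS = [
    {"local": "sent", "rank": 0, "delivered": True},
    {"local": "sent", "rank": 1, "delivered": False},
]
RUNS_OF_P = [p for p in ALL_POINTS if p["rank"] == 0]

print(knows(ALL_POINTS, "sent", delivered))     # False
print(believes(ALL_POINTS, "sent", delivered))  # True
print(knows(RUNS_OF_P, "sent", delivered))      # True
```

In the larger system the agent no longer knows the property guaranteed by P but still believes it, and on the rank-0 runs alone belief and knowledge give the same answer.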

Using the counterfactual operator and this interpretation for belief, we get the program
BT^:

if B_S(do(S, skip) > ◇recbit) then skip else sendbit.

We show that using counterfactuals in this way has the desired effect here. If message
delivery is guaranteed, then after the message has been sent once, under what seems
to be the most reasonable interpretation of "the closest world" where the message is
not sent, the sender believes that the bit will eventually be received. In particular, in
contexts where messages are delivered in five rounds, using BT^, the sender will send one
message.

As we said, one advantage of BT'_S over the standard program BT_S is that it abstracts
away the mechanism by which S learns that the bit was received by R. We can abstract
even further. The reason that S keeps sending the bit to R is that S wants R to know
the value of the bit. Thus, intuitively, S should keep sending the bit until it knows that
R knows its value. Let K_R(bit) be an abbreviation for K_R(bit = 0) ∨ K_R(bit = 1), so
K_R(bit) is true precisely if R knows the value of the bit. The sender's behavior can be
characterized by the following knowledge-based program, BT^:

if K_S K_R(bit) then skip else sendbit.

Clearly, when a message stating the value of the bit reaches the receiver, K_R(bit) holds.
But it also holds in other circumstances. If, for example, K_S K_R(bit) holds initially,
then there is no need to send anything.

As above, it seems more efficient for the sender to stop sending when he knows that
the receiver will eventually know the value of the bit. This suggests using the following
program:

if K_S(do(S, skip) ⇒ ◇K_R(bit)) then skip else sendbit.

However, the same reasoning as in the case of BT^ shows that this program is not
implementable. And, again, using belief and counterfactuals, we can get a program
BT^^ that does work, and uses fewer messages than BT^. In fact, the following program
does the job:

if B_S(do(S, skip) > ◇B_R(bit)) then skip else sendbit,

except that now we have to take B_R(bit) to be an abbreviation for (bit = 0 ∧ B_R(bit =
0)) ∨ (bit = 1 ∧ B_R(bit = 1)). Note that K_R(bit), which was defined to be K_R(bit = 0) ∨
K_R(bit = 1), is logically equivalent to (bit = 0 ∧ K_R(bit = 0)) ∨ (bit = 1 ∧ K_R(bit = 1)),
since K_R φ ⇒ φ is valid for any formula φ. But, in general, B_R φ ⇒ φ is not valid, so
adding the additional conjuncts in the case of belief makes what turns out to be quite an
important difference. Intuitively, B_R(bit) says that R has correct beliefs about the value
of the bit.

The rest of this paper is organized as follows: In the next section, there is an informal
review of the semantics of knowledge-based programs. Section 3 extends the knowledge-based
framework by adding counterfactuals and beliefs. We then formally analyze the
programs BT^ and BT^^, showing that they have the appropriate properties. We
conclude in Section 4.
2 Giving semantics to knowledge-based programs



Formal semantics for knowledge-based programs are provided by Fagin, Halpern, Moses, 
and Vardi [1995, 1997]. To keep the discussion in this paper at an informal level, we 
simplify things somewhat here, and review what we hope will be just enough of the 
details so that the reader will be able to follow the main points. All the definitions in 
this section, except that of de facto implementation at the end of the section, are taken 
from [Fagin, Halpern, Moses, and Vardi 1995]. 

Informally, we view a multi-agent system as consisting of a number of interacting 
agents. We assume that, at any given point in time, each agent in the system is in some 
local state. A global state is just a tuple consisting of each agent's local state, together 
with the state of the environment, where the environment's state accounts for everything 
that is relevant to the system that is not contained in the state of the processes. The 
agents' local states typically change over time, as a result of actions that they perform. 
A run is a function from time to global states. Intuitively, a run is a complete description 
of what happens over time in one possible execution of the system. A point is a pair 
(r, m) consisting of a run r and a time m. If r(m) = (ℓ_e, ℓ_1, ..., ℓ_n), then we use r_i(m)
to denote process i's local state ℓ_i at the point (r, m), i = 1, ..., n, and r_e(m) to denote
the environment's state ℓ_e. For simplicity, time here is taken to range over the natural
numbers rather than the reals (so that time is viewed as discrete, rather than dense or
continuous). Round m in run r occurs between time m − 1 and m. A system ℛ is a
set of runs; intuitively, these runs describe all the possible executions of the system. For
example, in a poker game, the runs could describe all the possible deals and bidding 
sequences. 

Of major interest in this paper are the systems that we can associate with a program. 
To do this, we must first associate a system with a joint protocol. A protocol is a function 
from local states to nonempty sets of actions. (We often consider deterministic protocols, 
in which a local state is mapped to a singleton set of actions. Such protocols can be viewed 
as functions from local states to actions.) A joint protocol is just a set of protocols, one 
for each process/agent. 

We would like to be able to generate the system corresponding to a given joint protocol
P. To do this, we need to describe the setting, or context, in which P is being
executed. Formally, a context γ is a tuple (P_e, 𝒢_0, τ, Ψ), where P_e is a protocol for the
environment, 𝒢_0 is a set of initial global states, τ is a transition function, and Ψ is a set
of admissible runs. The environment is viewed as running a protocol just like the agents;
its protocol is used to capture features of the setting such as "all messages are delivered
within 5 rounds" or "messages may be lost". The transition function τ describes how
the actions performed by the agents and the environment change the global state by
associating with each joint action (a tuple consisting of an action for the environment
and one for each of the agents) a global state transformer, that is, a mapping from global
states to global states. For the simple programs considered in this paper, the transition
function will be almost immediate from the description of the global states. The set Ψ of
admissible runs is useful for capturing various fairness properties of the context. Typically,
when no fairness constraints are imposed, Ψ is the set of all runs. (For a discussion
of the role of the set Ψ of admissible runs see [Fagin, Halpern, Moses, and Vardi 1995].)
Since our focus in this paper is reasoning about actions and when they are performed,
we assume that all contexts are such that the environment's state at the point (r, m)
records the joint action performed in the previous round (that is, between (r, m − 1)
and (r, m)). (Thus, we are essentially considering what are called recording contexts in
[Fagin, Halpern, Moses, and Vardi 1995].)

A run r is consistent with a protocol P if it could have been generated when running
protocol P. Formally, run r is consistent with joint protocol P in context γ if r ∈ Ψ (so r
is admissible according to the context γ), its initial global state r(0) is one of the initial
global states 𝒢_0 given in γ, and for all m, the transition from global state r(m) to r(m + 1)
is the result of performing one of the joint actions specified by P and the environment
protocol P_e (given in γ) in the global state r(m). That is, if P = (P_1, ..., P_n), P_e is
the environment's protocol in context γ, and r(m) = (ℓ_e, ℓ_1, ..., ℓ_n), then there must
be a joint action (a_e, a_1, ..., a_n) such that a_e ∈ P_e(ℓ_e), a_i ∈ P_i(ℓ_i) for i = 1, ..., n, and
r(m + 1) = τ(a_e, a_1, ..., a_n)(r(m)) (so that r(m + 1) is the result of applying the joint
action (a_e, a_1, ..., a_n) to r(m)). For future reference, we will say that a run r is consistent
with γ if r is consistent with some joint protocol P in γ. A system ℛ represents a joint
protocol P in a context γ if it consists of all runs in Ψ consistent with P in γ. We use
R(P, γ) to denote the system representing P in context γ.

The basic logical language ℒ that we use is a standard propositional temporal logic.
We start out with a set Φ of primitive propositions p, q, ... (which are sometimes given
more meaningful names such as recbit or recack). Every primitive proposition is considered
to be a formula of ℒ. We close off under the Boolean operators ∧ (conjunction) and
¬ (negation). Thus, if φ and ψ are formulas of ℒ, then so are ¬φ and φ ∧ ψ. The other
Boolean operators are definable in terms of these. E.g., implication φ ⇒ ψ is defined
as ¬(φ ∧ ¬ψ). Finally, we close off under temporal operators. For the purposes of this
paper, it suffices to consider the standard linear-time temporal operators ○ ("in the next
(global) state") and ◇ ("eventually"): If φ is a formula, then so are ○φ and ◇φ. The
dual of ◇, which stands for "forever," is denoted by □ and defined to be shorthand for
¬◇¬. This completes the definition of the language.

In order to assign meaning to the formulas of such a language ℒ in a system ℛ, we
need an interpretation π, which determines the truth of the primitive propositions at
each of the global states of ℛ. Thus, π : Φ × 𝒢 → {true, false}, where π(p, g) = true
exactly if the proposition p is true at the global state g. An interpreted system is a pair
ℐ = (ℛ, π) where ℛ is a system as before, and π is an interpretation for Φ in ℛ. Formulas
of ℒ are considered true or false at a point (r, m) with respect to an interpreted system
ℐ = (ℛ, π) where r ∈ ℛ. Formally,

• (ℐ, r, m) ⊨ p, for p ∈ Φ, iff π(p, r(m)) = true.

• (ℐ, r, m) ⊨ ¬φ iff (ℐ, r, m) ⊭ φ.

• (ℐ, r, m) ⊨ φ ∧ ψ iff both (ℐ, r, m) ⊨ φ and (ℐ, r, m) ⊨ ψ.

• (ℐ, r, m) ⊨ ○φ iff (ℐ, r, m + 1) ⊨ φ.

• (ℐ, r, m) ⊨ ◇φ iff (ℐ, r, m′) ⊨ φ for some m′ ≥ m.

By adding an interpretation π to the context γ, we obtain an interpreted context (γ, π).
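
The truth conditions for the temporal language can be turned into a small evaluator. The Python sketch below is illustrative and makes assumptions that are not in the paper: formulas are encoded as nested tuples with hypothetical tags (`"prop"`, `"not"`, `"and"`, `"next"`, `"eventually"`), a run is a finite list of states mapping proposition names to truth values (playing the role of the interpretation), the last state is assumed to repeat forever, and "eventually" is read as "at the current time or later".

```python
def state_at(run, m):
    """Global state at time m; the final state is assumed to repeat forever."""
    return run[min(m, len(run) - 1)]


def holds(run, m, formula):
    """Decide whether `formula` is true at the point (run, m)."""
    kind = formula[0]
    if kind == "prop":
        return state_at(run, m)[formula[1]]
    if kind == "not":
        return not holds(run, m, formula[1])
    if kind == "and":
        return holds(run, m, formula[1]) and holds(run, m, formula[2])
    if kind == "next":
        return holds(run, m + 1, formula[1])
    if kind == "eventually":
        # Beyond the last listed state nothing changes, so it suffices to
        # check times m, ..., max(m, len(run) - 1).
        last = max(m, len(run) - 1)
        return any(holds(run, k, formula[1]) for k in range(m, last + 1))
    raise ValueError(f"unknown formula tag: {kind}")


def always(formula):
    """The dual of 'eventually': not eventually not."""
    return ("not", ("eventually", ("not", formula)))


RECBIT = ("prop", "recbit")
run = [{"recbit": False}, {"recbit": False}, {"recbit": True}]

print(holds(run, 0, ("eventually", RECBIT)))  # True
print(holds(run, 0, ("next", RECBIT)))        # False
print(holds(run, 2, always(RECBIT)))          # True
```

The helper `always` mirrors the definition of the dual operator as an abbreviation, so no separate clause is needed for it.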

We now describe a simple programming language, introduced in [Fagin, Halpern, 
Moses, and Vardi 1995], which is still rich enough to describe protocols, and whose 
syntax emphasizes the fact that an agent performs actions based on the result of a test 
that is applied to her local state. A (standard) program for agent i is a statement of the 
form: 

case of
    if t_1 do a_1
    if t_2 do a_2
    ...
end case

where the t_j's are standard tests for agent i and the a_j's are actions of agent i (i.e.,
a_j ∈ ACT_i). (We later modify these programs to obtain knowledge-based and belief-based
programs; the distinction will come from the kinds of tests allowed. We omit the
case statement if there is only one clause.) A standard test for agent i is simply a
propositional formula over a set of primitive propositions. Intuitively, if L_i represents
the local states of agent i in 𝒢, then once we know how to evaluate the tests in the
program at the local states in L_i, we can convert this program to a protocol over L_i: at
a local state ℓ, agent i nondeterministically chooses one of the (possibly infinitely many)
clauses in the case statement whose test is true at ℓ, and executes the corresponding
action.

We want to use an interpretation π to tell us how to evaluate the tests. However, not
just any interpretation will do. We intend the tests in a program for agent i to be local,
that is, to depend only on agent i's local state. It would be inappropriate for agent i's
action to depend on the truth value of a test that i could not determine from her local
state. An interpretation π on the global states in 𝒢 is compatible with a program Pg_i
for agent i if every proposition that appears in Pg_i is local to i; that is, if q appears
in Pg_i, the states s and s′ are in 𝒢, and s ∼_i s′, then π(s)(q) = π(s′)(q). If φ is a
propositional formula all of whose primitive propositions are local to agent i, and ℓ is a
local state of agent i, then we write (π, ℓ) ⊨ φ if φ is satisfied by the truth assignment
π(s), where s = (s_e, s_1, ..., s_n) is a global state such that s_i = ℓ. Because all the primitive
propositions in φ are local to i, it does not matter which global state s we choose, as
long as i's local state in s is ℓ. Given a program Pg_i for agent i and an interpretation π
compatible with Pg_i, we define a protocol that we denote Pg_i^π by setting:

Pg_i^π(ℓ) = {a_j : (π, ℓ) ⊨ t_j}   if {j : (π, ℓ) ⊨ t_j} ≠ ∅,
Pg_i^π(ℓ) = {skip}                 if {j : (π, ℓ) ⊨ t_j} = ∅.
Intuitively, Pg_i^π selects all actions from the clauses that satisfy the test, and selects the
null action skip if no test is satisfied. In general, we get a nondeterministic protocol, since
more than one test may be satisfied at a given state.
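
The passage from a standard program to a protocol can be sketched in a few lines. In the Python illustration below, which is a sketch under our own assumptions, a program is a list of (test, action) clauses, each test is a predicate on local states (so the interpretation is assumed to be already applied), and the function name `protocol_from_program` is hypothetical.

```python
def protocol_from_program(clauses):
    """Turn a list of (test, action) clauses into a protocol, that is, a
    function from local states to nonempty sets of actions."""
    def protocol(local_state):
        actions = {action for test, action in clauses if test(local_state)}
        # If no test is satisfied, the null action skip is selected.
        return actions if actions else {"skip"}
    return protocol


# The sender's standard program 'if recack then skip else sendbit', written
# as two clauses over a local state with a hypothetical 'recack' field.
bt_sender = protocol_from_program([
    (lambda s: s["recack"], "skip"),
    (lambda s: not s["recack"], "sendbit"),
])

print(bt_sender({"recack": False}))  # {'sendbit'}
print(bt_sender({"recack": True}))   # {'skip'}
```

Because the result is a set of actions, overlapping tests naturally yield a nondeterministic protocol, and a program with no satisfied test yields the singleton containing skip.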

Many of the definitions that we gave for protocols have natural analogues for programs.
We define a joint program to be a tuple Pg = (Pg_1, ..., Pg_n), where Pg_i is a
program for agent i. An interpretation π is compatible with Pg if π is compatible with
each of the Pg_i's. From Pg and π we get a joint protocol Pg^π = (Pg_1^π, ..., Pg_n^π). We say
that an interpreted system ℐ = (ℛ, π) represents a joint program Pg in the interpreted
context (γ, π) exactly if π is compatible with Pg and ℐ represents the corresponding
protocol Pg^π. We denote the interpreted system representing Pg in (γ, π) by I(Pg, γ, π).
Of course, this definition only makes sense if π is compatible with Pg. From now on we
always assume that this is the case.

The syntactic form of our standard programs is in many ways more restricted than 
that of programs in common programming languages such as C or FORTRAN. In such 
languages, one typically sees constructs such as for, while, or if. . . then. . . else. . . , which 
do not have syntactic analogues in our formalism. As discussed in [Fagin, Halpern, 
Moses, and Vardi 1995], it is possible to encode a program counter in tests and actions 
of standard programs. By doing so, it is possible to simulate these constructs. Hence, 
there is essentially no loss of generality in our definition of standard programs. 

Since each test in a standard program Pg run by process i can be evaluated in each
local state, we can derive a protocol from Pg in an obvious way: to find out what process
i does in a local state ℓ, we evaluate the tests in the program in ℓ and perform the
appropriate action. A run is consistent with Pg in interpreted context (γ, π) if it is consistent
with the protocol derived from Pg. Similarly, a system represents Pg in interpreted
context (γ, π) if it represents the protocol derived from Pg in (γ, π).

Example 2.1 Consider the (joint) program BT = (BT_S, BT_R), where BT_S is as defined
in the introduction, and BT_R is the program

if recbit then sendack else skip.

Thus, in BT_R, the receiver sends an acknowledgment if it has received the bit, and
otherwise does nothing. This program, like all the programs considered in this paper,
is applied repeatedly, so it effectively runs forever. Assume that S's local state includes
the time, its input bit, and whether or not S has received an acknowledgment
from R; the state thus has the form (m, i, x), where m is a natural number (the time),
i ∈ {0, 1} is the input bit, and x ∈ {λ, ack}. Similarly, R's local state has the form
(m, x), where m is the time and x is either λ, 0, or 1, depending on whether or not it has
received the bit from S and what the bit is. As in all recording contexts, the environment
state keeps track of the actions performed by the agents. Since the environment
state plays no role here, we omit it from the description of the global state, and just
identify the global state with the pair consisting of S's and R's local states. Suppose that,
in context γ, the environment protocol nondeterministically decides whether or not a
message sent by S and/or R is delivered, the initial global states are ((0, 0, λ), (0, λ)) and
((0, 1, λ), (0, λ)), the transition function is such that the joint actions have the obvious
effect on the global state, and all runs are admissible. Then a run consistent with BT
in (γ, π) in which S's bit is 0, R receives the bit in the second round, and S receives
an acknowledgment from R in the third round has the following sequence of global states:

((0, 0, λ), (0, λ)), ((1, 0, λ), (1, λ)), ((2, 0, λ), (2, 0)), ((3, 0, ack), (3, 0)), ((4, 0, ack), (4, 0)), ...
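
The run described in Example 2.1 can be regenerated mechanically. The Python sketch below is an illustration under assumptions of our own: the environment's choices are supplied as a list of hypothetical pairs (deliver_s, deliver_r), one per round, saying whether a message sent by S or by R in that round is delivered; a delivered message is assumed to arrive in the round in which it is sent; and the environment state is omitted, as in the example.

```python
LAMBDA = "λ"  # marker for "nothing received yet"


def bt_sender(local):
    """BT_S: 'if recack then skip else sendbit' on local state (m, bit, x)."""
    _, _, x = local
    return "skip" if x == "ack" else "sendbit"


def bt_receiver(local):
    """BT_R: 'if recbit then sendack else skip' on local state (m, x)."""
    _, x = local
    return "skip" if x == LAMBDA else "sendack"


def step(state, deliver_s, deliver_r):
    """Apply one round of the joint action to a global state (assumed transition)."""
    (m, bit, s_x), (_, r_x) = state
    a_s, a_r = bt_sender(state[0]), bt_receiver(state[1])
    if a_s == "sendbit" and deliver_s and r_x == LAMBDA:
        r_x_next = bit
    else:
        r_x_next = r_x
    s_x_next = "ack" if (a_r == "sendack" and deliver_r) else s_x
    return ((m + 1, bit, s_x_next), (m + 1, r_x_next))


def run(bit, deliveries):
    """Global states at times 0, 1, ..., len(deliveries)."""
    state = ((0, bit, LAMBDA), (0, LAMBDA))
    states = [state]
    for deliver_s, deliver_r in deliveries:
        state = step(state, deliver_s, deliver_r)
        states.append(state)
    return states


# Round 1: S's message is lost. Round 2: it is delivered.
# Round 3: R's ack is delivered. Round 4: nothing is delivered.
for s in run(0, [(False, False), (True, False), (False, True), (False, False)]):
    print(s)
```

With these delivery choices the printed states agree with the sequence of global states listed in the example.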

Now we consider knowledge-based programs. We start by extending our logical language
by adding a modal operator K_i for every agent i = 1, ..., n. Thus, whenever φ
is a formula, so is K_i φ. Let ℒ_K be the resulting language. According to the standard
definition of knowledge in systems [Fagin, Halpern, Moses, and Vardi 1995], an agent i
knows a fact φ at a given point (r, m) in interpreted system ℐ = (ℛ, π) if φ is true at all
points in ℛ where i has the same local state as it does at (r, m). We now have

• (ℐ, r, m) ⊨ K_i φ if (ℐ, r′, m′) ⊨ φ for all points (r′, m′) such that r_i(m) = r′_i(m′).

Thus, i knows φ at the point (r, m) if φ holds at all points consistent with i's information
at (r, m).
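
The definition of knowledge quantifies over all points that the agent cannot distinguish from the current one, which is easy to mimic on a finite system. The Python sketch below is illustrative and assumes a finite representation that is not in the paper: a system is a list of runs, each run a finite list of global states, each global state a dictionary from agent names to local states, and a fact is a function of the system and a point. The names `knows` and `bit_is_0` are hypothetical.

```python
def knows(system, agent, point, fact):
    """K_agent(fact) at point (r, m): fact holds at every point of the system
    where the agent has the same local state as at (r, m)."""
    r, m = point
    here = system[r][m][agent]
    return all(
        fact(system, (r2, m2))
        for r2, other_run in enumerate(system)
        for m2, state in enumerate(other_run)
        if state[agent] == here
    )


def bit_is_0(system, point):
    r, m = point
    return system[r][m]["S"][1] == 0


# Two runs that differ only in S's input bit. S's local state is (time, bit);
# R's local state is (time, x), where x is "λ" until the bit arrives.
system = [
    [{"S": (0, 0), "R": (0, "λ")}, {"S": (1, 0), "R": (1, 0)}],
    [{"S": (0, 1), "R": (0, "λ")}, {"S": (1, 1), "R": (1, 1)}],
]

print(knows(system, "S", (0, 0), bit_is_0))  # True
print(knows(system, "R", (0, 0), bit_is_0))  # False
print(knows(system, "R", (0, 1), bit_is_0))  # True
```

At time 0 the receiver has the same local state in both runs, so it does not know the bit; once its local state records the bit, the only matching point is in the run where the bit is 0.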

A knowledge-based program has the same structure as a standard program except
that all tests in the program text Pg_i for agent i are formulas of the form K_i φ.^ As
for standard programs, we can define when a protocol implements a knowledge-based 
program, except this time it is with respect to an interpreted context. The situation 
in this case is, however, somewhat more complicated. In a given context, a process 
can determine the truth of a standard test such as "x = 0" by simply checking its 
local state. However, the truth of the tests for knowledge that appear in knowledge- 
based programs cannot in general be determined simply by looking at the local state 
in isolation. We need to look at the whole system. As a consequence, given a run, we 
cannot in general determine if it is consistent with a knowledge-based program in a given 
interpreted context. This is because we cannot tell how the tests for knowledge turn out 
without being given the other possible runs of the system; what a process knows at one 
point will depend in general on what other points are possible. This stands in sharp 
contrast to the situation for standard programs. 

This means it no longer makes sense to talk about a run being consistent with a
knowledge-based program in a given context. However, notice that, given an interpreted
system $\mathcal{I} = (\mathcal{R}, \pi)$, we can derive a protocol from a knowledge-based program $\mathsf{Pg}_{kb}$
for process $i$ by evaluating the knowledge tests in $\mathsf{Pg}_{kb}$ with respect to $\mathcal{I}$. That is, a
test such as $K_i\varphi$ holds in a local state $\ell$ if $\varphi$ holds at all points $(r, m)$ in $\mathcal{I}$ such that
$r_i(m) = \ell$.^ In general, different protocols can be derived from a given knowledge-based
program, depending on what system we use to evaluate the tests. Let $\mathsf{Pg}_{kb}^{\mathcal{I}}$ denote the
protocol derived from $\mathsf{Pg}_{kb}$ by using $\mathcal{I}$ to evaluate the tests for knowledge. An interpreted
system $\mathcal{I}$ represents the knowledge-based program $\mathsf{Pg}_{kb}$ in interpreted context $(\gamma, \pi)$ if
$\mathcal{I}$ represents the protocol $\mathsf{Pg}_{kb}^{\mathcal{I}}$. That is, $\mathcal{I}$ represents $\mathsf{Pg}_{kb}$ if $\mathcal{I} = \mathbf{I}(\mathsf{Pg}_{kb}^{\mathcal{I}}, \gamma, \pi)$. Thus,
a system represents $\mathsf{Pg}_{kb}$ if it satisfies a certain fixed-point equation. A protocol $P$
implements $\mathsf{Pg}_{kb}$ in interpreted context $(\gamma, \pi)$ if $P = \mathsf{Pg}_{kb}^{\mathbf{I}(P,\gamma,\pi)}$.

^All standard programs can be viewed as knowledge-based programs. Since all the tests in a standard
program for agent $i$ must be local to $i$, every test $\varphi$ in a standard program for agent $i$ is equivalent to
$K_i\varphi$.

This definition is somewhat subtle, and determining the protocol(s) implementing a 
given knowledge-based program may be nontrivial. Indeed, as shown by Fagin, Halpern, 
Moses, and Vardi [1995, 1997], in general, there may be no protocols implementing a 
knowledge-based program $\mathsf{Pg}_{kb}$ in a given context, there may be only one, or there may
be more than one, since the fixed-point equation may have no solutions, one solution, 
or many solutions. In particular, it is not hard to show that there is no (joint) pro- 
tocol implementing a (joint) program where S uses BT^ or BT^, as described in the 
introduction. 

For the purposes of this paper, it is useful to have a notion slightly weaker than that
of implementation. Two joint protocols $P = (P_1, \ldots, P_n)$ and $P' = (P'_1, \ldots, P'_n)$ are
equivalent in context $\gamma$, denoted $P \approx_\gamma P'$, if (a) $\mathbf{R}(P, \gamma) = \mathbf{R}(P', \gamma)$ and (b) $P_i(\ell) =
P'_i(\ell)$ for every local state $\ell = r_i(m)$ with $r \in \mathbf{R}(P, \gamma)$. Thus, two protocols that are
equivalent in $\gamma$ may disagree on the actions performed in some local states, provided
that those local states never arise in the actual runs of these protocols in $\gamma$. We say
$P$ de facto implements a knowledge-based program $\mathsf{Pg}_{kb}$ in interpreted context $(\gamma, \pi)$ if
$P \approx_\gamma \mathsf{Pg}_{kb}^{\mathbf{I}(P,\gamma,\pi)}$. Arguably, de facto implementation suffices for most purposes, since all
we care about are the runs generated by the protocol. We do not care about the behavior 
of the protocol on local states that never arise. 
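Conditions (a) and (b) of the equivalence relation can be checked mechanically in a small deterministic setting. The sketch below is only an illustration under strong assumptions: a single agent, no environment nondeterminism, a finite horizon, and a local state of the hypothetical form (time, number of messages sent). The names `run_of`, `equivalent`, `by_time`, `by_count`, and `always_send` are not from the paper.

```python
from typing import Callable, List, Tuple

LocalState = Tuple[int, int]            # (time, number of messages sent so far)
Protocol = Callable[[LocalState], str]  # maps a local state to an action


def run_of(protocol: Protocol, horizon: int) -> List[LocalState]:
    """The unique run prefix of a deterministic one-agent protocol."""
    state = (0, 0)
    run = [state]
    for _ in range(horizon):
        action = protocol(state)
        time, sent = state
        state = (time + 1, sent + (1 if action == "sendbit" else 0))
        run.append(state)
    return run


def equivalent(p: Protocol, q: Protocol, horizon: int) -> bool:
    """Check (a) same runs and (b) same actions on local states that arise."""
    run_p, run_q = run_of(p, horizon), run_of(q, horizon)
    same_runs = run_p == run_q                   # condition (a)
    agree = all(p(s) == q(s) for s in run_p)     # condition (b)
    return same_runs and agree


def by_time(state: LocalState) -> str:
    return "sendbit" if state[0] == 0 else "skip"


def by_count(state: LocalState) -> str:
    return "sendbit" if state[1] == 0 else "skip"


def always_send(state: LocalState) -> str:
    return "sendbit"
```

Here `by_time` and `by_count` are equivalent even though they disagree on the local state (1, 0), because that state never arises in their common run.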

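It is almost immediate from the definition that if $P$ implements $\mathsf{Pg}_{kb}$, then $P$ de facto
implements $\mathsf{Pg}_{kb}$. The converse may not be true, since we may have $P \approx_\gamma \mathsf{Pg}_{kb}^{\mathbf{I}(P,\gamma,\pi)}$
without having $P = \mathsf{Pg}_{kb}^{\mathbf{I}(P,\gamma,\pi)}$. On the other hand, as the following lemma shows, if $P$
de facto implements $\mathsf{Pg}_{kb}$, then a protocol closely related to $P$ implements $\mathsf{Pg}_{kb}$.

Lemma 2.2 If $P$ de facto implements $\mathsf{Pg}_{kb}$ in $(\gamma, \pi)$, then $\mathsf{Pg}_{kb}^{\mathbf{I}(P,\gamma,\pi)}$ implements $\mathsf{Pg}_{kb}$ in
$(\gamma, \pi)$.

Proof Suppose that $P$ de facto implements $\mathsf{Pg}_{kb}$ in $(\gamma, \pi)$. Let $P' = \mathsf{Pg}_{kb}^{\mathbf{I}(P,\gamma,\pi)}$. By
definition, $P' \approx_\gamma P$. Thus, $\mathbf{I}(P', \gamma, \pi) = \mathbf{I}(P, \gamma, \pi)$, so $P' = \mathsf{Pg}_{kb}^{\mathbf{I}(P',\gamma,\pi)}$. It follows that $P'$
implements $\mathsf{Pg}_{kb}$. ■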



^Note that if there is no point $(r, m)$ in $\mathcal{I}$ such that $r_i(m) = \ell$, then $K_i\varphi$ vacuously holds at $\ell$, for all
formulas $\varphi$.

3 Counterfactuals and Belief

In this section, we show how counterfactuals and belief can be added to the
knowledge-based framework, and use them to do a formal analysis of the programs $\mathsf{BT}^{>}$ and BT^^
from the introduction.

3.1 Counterfactuals 

The semantics we use for counterfactuals is based on the standard semantics used in
the philosophy literature [Lewis 1973; Stalnaker 1968]. As with other modal logics, this
semantics starts with a set $W$ of possible worlds. For every possible world $w \in W$
there is a (partial) order $<_w$ defined on $W$. Intuitively, $w_1 <_w w_2$ if $w_1$ is "closer" or
"more similar" to world $w$ than $w_2$ is. This partial order is assumed to satisfy certain
constraints, such as the condition that $w <_w w'$ for all $w' \neq w$: world $w$ is closer to $w$
than any other world is. A counterfactual statement of the form $\varphi > \psi$ is then taken to
be true at a world $w$ if, in all the worlds closest to $w$ among the worlds where $\varphi$ is true,
$\psi$ is also true.

In our setting, we obtain a notion of closeness by associating with every point $(r, m)$
of a system $\mathcal{I}$ a partial order on the points of $\mathcal{I}$.^ An order assignment for a system
$\mathcal{I} = (\mathcal{R}, \pi)$ is a function $\ll$ that associates with every point $(r, m)$ of $\mathcal{I}$ a partial
order relation $\ll_{(r,m)}$ over the points of $\mathcal{I}$. The partial orders must satisfy the constraint
that $(r, m)$ is a minimal element of $\ll_{(r,m)}$, so that there is no run $r' \in \mathcal{R}$ and time $m' \ge 0$
satisfying $(r', m') \ll_{(r,m)} (r, m)$. A counterfactual system is a pair of the form $\mathcal{J} = (\mathcal{I}, \ll)$,
where $\mathcal{I}$ is an interpreted system as before, while $\ll$ is an order assignment for the points
in $\mathcal{I}$. Given a counterfactual system $\mathcal{J} = (\mathcal{I}, \ll)$, a point $(r, m)$ in $\mathcal{I}$, and a set $A$ of
points of $\mathcal{I}$, define

$\mathrm{closest}(A, (r, m), \mathcal{J}) = \{(r', m') \in A : \text{there is no } (r'', m'') \in A \text{ such that } (r'', m'') \ll_{(r,m)} (r', m')\}$.

Thus, $\mathrm{closest}(A, (r, m), \mathcal{J})$ consists of the closest points to $(r, m)$ among the points in $A$
(according to the order assignment $\ll$).

To allow for counterfactual statements, we extend our logical language $\mathcal{L}$ with a binary
operator $>$ on formulas, so that whenever $\varphi$ and $\psi$ are formulas, so is $\varphi > \psi$. We read
$\varphi > \psi$ as "if $\varphi$ were the case, then $\psi$," and denote the resulting language by $\mathcal{L}^{>}$.

Let $[\![\varphi]\!] = \{(r, m) : (\mathcal{J}, r, m) \models \varphi\}$; that is, $[\![\varphi]\!]$ consists of all points in $\mathcal{J}$ satisfying
$\varphi$. We can now define the semantics of counterfactuals as follows:

$(\mathcal{J}, r, m) \models \varphi > \psi$ if $(\mathcal{J}, r', m') \models \psi$ for all $(r', m') \in \mathrm{closest}([\![\varphi]\!], (r, m), \mathcal{J})$.

^In a more general treatment, we could associate a different partial order with every agent at every
point; this is not necessary for the examples we consider in this paper.

This definition captures the intuition for counterfactuals stated earlier: $\varphi > \psi$ is true at
a point $(r, m)$ if $\psi$ is true at the points closest to $(r, m)$ where $\varphi$ is true.
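The closest-point construction and the truth condition for the counterfactual operator can be sketched over a finite set of points. This is an illustrative sketch only: the order assignment is assumed to be supplied as a hypothetical predicate `before(ref, p, q)` meaning that `p` is strictly closer to `ref` than `q` is, and the integer "points" with distance-based closeness are a stand-in for the points of a system.

```python
from typing import Callable, Hashable, Iterable, Set

Point = Hashable
Before = Callable[[Point, Point, Point], bool]  # before(ref, p, q): p strictly closer to ref than q
Formula = Callable[[Point], bool]


def closest(A: Set[Point], ref: Point, before: Before) -> Set[Point]:
    """Points of A with no strictly closer point of A, relative to ref."""
    return {p for p in A if not any(before(ref, q, p) for q in A)}


def counterfactual(phi: Formula, psi: Formula, ref: Point,
                   all_points: Iterable[Point], before: Before) -> bool:
    """phi > psi at ref: psi holds at all closest points satisfying phi."""
    sat_phi = {p for p in all_points if phi(p)}
    return all(psi(p) for p in closest(sat_phi, ref, before))


# Hypothetical example: points are integers, closeness is numeric distance,
# so each reference point is trivially a minimal element of its own order.
POINTS = range(6)


def by_distance(ref: int, p: int, q: int) -> bool:
    return abs(p - ref) < abs(q - ref)


def is_even(p: int) -> bool:
    return p % 2 == 0


def below_three(p: int) -> bool:
    return p < 3
```

Note that when no point satisfies the antecedent, the set of closest points is empty and the counterfactual holds vacuously, as the definition requires.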

All earlier analyses of (epistemic) properties of a protocol $P$ in a context $\gamma$ used the
interpreted system $\mathbf{I}(P, \gamma, \pi)$, consisting of all the runs consistent with $P$ in context $\gamma$.
However, counterfactual reasoning involves events that occur on runs that are not
consistent with $P$. To support such reasoning we need to consider runs not in $\mathbf{I}(P, \gamma, \pi)$.
The runs that must be added can, in general, depend on the type of counterfactual
statements allowed in the logical language. Thus, for example, if we allow formulas of the
form $do(i, a) > \psi$ for process $i$ and action $a$, then we must allow, at every point of the
system, a possible future in which $i$'s next action is $a$.^

An even richer set of runs is needed if we allow the language to specify a sequence of
actions performed by a given process, or if counterfactual conditionals $>$ can be nested.
To handle a broad class of applications, including ones involving formulas with temporal
operators and arbitrary nesting of conditional statements involving $do(i, a)$, we do
reasoning with respect to the system $\mathcal{I}^+(\gamma, \pi) = (\mathcal{R}^+(\gamma), \pi)$ consisting of all runs
compatible with $\gamma$, that is, all runs consistent with some protocol $P'$ in context $\gamma$. In this way all
possible behaviors, within the constraints induced by $\gamma$, can be reasoned about. There is
a potential problem with using system $\mathcal{I}^+(\gamma, \pi) = (\mathcal{R}^+(\gamma), \pi)$ for reasoning about $P$: all
reference to $P$ has been lost. We return to this issue in the next section, when we discuss
belief. For now we show how to use $\mathcal{I}^+(\gamma, \pi)$ as a basis for doing counterfactual reasoning.

As we have already discussed, the main issue in using $\mathcal{I}^+(\gamma, \pi)$ to reason about $P$ is
that of defining an appropriate order assignment. We are interested in order assignments
that depend on the protocol in a uniform way. An order generator $o$ for a context $\gamma$
is a function that associates with every protocol $P$ an order assignment $o(P)$ on
the points of $\mathcal{R}^+(\gamma)$. A counterfactual context is a tuple $\zeta = (\gamma, \pi, o)$, where $o$ is an
order generator for $\gamma$. In what follows we denote by $\mathcal{J}^+(P, \zeta)$ the counterfactual system
$(\mathcal{I}^+(\gamma, \pi), o(P))$, where $\zeta = (\gamma, \pi, o)$; we omit $\zeta$ when it is clear from context.

We are interested in order generators $o$ such that $o(P)$ says something about
deviations from $P$. For the technical results we prove in the rest of the paper, we focus
on order generators that prefer runs in which the agents do not deviate from their
protocol. Given an agent $i$, action $a$, protocol $P$, context $\gamma$, and point $(r, m)$ in $\mathcal{R}^+(\gamma)$,
define $\mathrm{close}(i, a, P, \gamma, (r, m)) = \{(r', m) :$ (a) $r' \in \mathcal{R}^+(\gamma)$, (b) $r'(m') = r(m')$ for all
$m' \le m$, (c) if agent $i$ performs $a$ in round $m + 1$ of $r$, then $r' = r$, (d) if agent $i$ does
not perform $a$ in round $m + 1$ of $r$, then agent $i$ performs $a$ in round $m + 1$
of $r'$ and follows $P$ in all later rounds, (e) all agents other than $i$ follow $P$ from round
$m + 1$ on in $r'\}$. That is, $\mathrm{close}(i, a, P, \gamma, (r, m))$ is the set of points $(r', m)$ where
run $r' = r$ if $i$ performs $a$ in round $m + 1$ of $r$; otherwise, $r'$ is identical to $r$ up
to time $m$ and all the agents act according to joint protocol $P$ at all later times,
except that at the point $(r', m)$, agent $i$ performs action $a$. An order generator $o$
for $\gamma$ respects protocols if, for every protocol $P$, point $(r, m)$ of $\mathbf{R}(P, \gamma)$, action $a$, and
agent $i$, $\mathrm{closest}([\![do(i, a)]\!], (r, m), \mathcal{J}^+(P))$ is a nonempty subset of $\mathrm{close}(i, a, P, \gamma, (r, m))$
that includes $(r, m)$. Of course, the most obvious order generator that respects
protocols just sets $\mathrm{closest}([\![do(i, a)]\!], (r, m), \mathcal{J}^+(P)) = \mathrm{close}(i, a, P, \gamma, (r, m))$. Since our
results hold for arbitrary order generators that respect protocols, we allow
$\mathrm{closest}([\![do(i, a)]\!], (r, m), \mathcal{J}^+(P))$ to be a strict subset of
$\mathrm{close}(i, a, P, \gamma, (r, m))$.

^Recall from the introduction that our programs use the formula $do(i, a)$ to state that agent $i$ is
about to perform action $a$. Thus, $do(i, a) > \varphi$ says "if agent $i$ were to perform $a$ then $\varphi$ would be the
case." We assume that all interpretations we consider give this formula the appropriate meaning. If the
protocol $P$ being used is encoded in the global state (for example, if it is part of the environment state),
then we can take $do(i, a)$ to be a primitive proposition. Otherwise, we cannot, since its truth cannot
be determined from the global state. However, we can always take $do(i, a)$ to be an abbreviation for
$\bigcirc last(i, a)$, where the interpretation $\pi$ ensures that $last(i, a)$ is true at a point $(r, m)$ if $i$ performed $a$ in
round $m$ of $r$. Since we assume that the last joint action performed is included in the environment state,
the truth of $last(i, a)$ is determined by the global state.

A number of points are worth noting about this definition: 

• If the environment's protocol $P_e$ and the agents' individual protocols in $P$ are all
deterministic, then $\mathrm{close}(i, a, P, \gamma, (r, m))$ is a singleton, since there is a unique run
where the agents act according to joint protocol $P$ at all times except that agent $i$
performs action $a$ at time $m$. Thus, $\mathrm{closest}([\![do(i, a)]\!], (r, m), \mathcal{J}^+(P))$ must be the
singleton $\mathrm{close}(i, a, P, \gamma, (r, m))$ in this case. However, in many cases, it is best to
view the environment as following a nondeterministic protocol (for example,
nondeterministically deciding at which round a message will be delivered); in this case,
there may be several points in $\mathcal{I}$ closest to $(r, m)$. Stalnaker [1968] required there
to be a unique closest world; Lewis [1973] did not. There was later discussion of
how reasonable this requirement was (see, for example, [Stalnaker 1980]). Thinking
in terms of systems may help inform this debate.

• If process $i$ does not perform action $a$ at the point $(r, m)$, then there may be points
in $\mathrm{closest}([\![do(i, a)]\!], (r, m), \mathcal{J}^+(P))$ that are not in $\mathbf{R}(P, \gamma)$, even if $r \in \mathbf{R}(P, \gamma)$.
These points are "counter to fact".

• According to our definition, the notion of "closest" depends on the protocol that
generates the system. For example, consider a context $\gamma'$ that is just like the
context $\gamma$ from Example 2.1, except that $S$ keeps track in its local state, not only
of the time, but also of the number of messages it has sent. Suppose that the
protocol $P_S$ for $S$ is determined by the program

if time = 0 then sendbit else skip,

while $P'_S$ is the protocol determined by the program

if #messages = 0 then sendbit else skip.

Let $P = (P_S, \mathsf{SKIP}_R)$ and $P' = (P'_S, \mathsf{SKIP}_R)$, where $\mathsf{SKIP}_R$ is the protocol where $R$
does nothing (performs the action skip) in all states. Clearly $\mathbf{R}(P, \gamma') = \mathbf{R}(P', \gamma')$:
whether it is following $P_S$ or $P'_S$, the sender $S$ sends a message only in the first
round of each run. It follows that these two protocols specify exactly the same
behavior in this context. While these protocols coincide when no deviations take
place, they may differ if deviations are possible. For example, imagine a situation
where, for whatever reason, $S$ did nothing in the first round. In that case,
at the end of the first round, the clock has advanced from 0, while the count of
the number of messages that $S$ has sent is still 0. $P$ and $P'$ would then produce
different behavior in the second round. This difference is captured by our
definitions. If $o$ respects protocols, then $\mathrm{closest}([\![do(S, \mathsf{skip})]\!], (r, 0), \mathcal{J}^+(P)) \neq
\mathrm{closest}([\![do(S, \mathsf{skip})]\!], (r, 0), \mathcal{J}^+(P'))$. No messages are sent by $S$ in runs appearing
in points in $\mathrm{closest}([\![do(S, \mathsf{skip})]\!], (r, 0), \mathcal{J}^+(P))$, while exactly one message is sent
by $S$ in each run appearing in points in $\mathrm{closest}([\![do(S, \mathsf{skip})]\!], (r, 0), \mathcal{J}^+(P'))$.

This dependence on the protocol is a deliberate feature of our definition; by using
order generators, the order assignment we consider is a function of the protocol
being used. While the protocols $P$ and $P'$ specify the same behavior in $\gamma'$, they
specify different behavior in "counterfactual" runs, where something happens that
somehow causes behavior inconsistent with the protocol. The subtle difference
between the two protocols is captured by our definitions.
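The difference between the two sender protocols under a deviation can be reproduced with a small simulation. This is a sketch under simplifying assumptions rather than the construction itself: only $S$ is modelled, message delivery is ignored, the run is truncated at a finite horizon, and the helper `actions_with_deviation` (a hypothetical name) forces one action in round $m + 1$ while $S$ follows its protocol in every other round.

```python
from typing import Callable, List, Optional, Tuple

LocalState = Tuple[int, int]            # S's local state: (time, messages sent so far)
Protocol = Callable[[LocalState], str]


def p_s(state: LocalState) -> str:
    """if time = 0 then sendbit else skip"""
    return "sendbit" if state[0] == 0 else "skip"


def p_s_prime(state: LocalState) -> str:
    """if #messages = 0 then sendbit else skip"""
    return "sendbit" if state[1] == 0 else "skip"


def actions_with_deviation(protocol: Protocol, forced_action: str,
                           m: Optional[int], horizon: int) -> List[str]:
    """S's actions in rounds 1..horizon, with forced_action in round m + 1.

    Passing m=None gives the run in which S never deviates.
    """
    state = (0, 0)
    actions = []
    for time in range(horizon):
        # The action chosen at time `time` is performed in round time + 1.
        action = forced_action if time == m else protocol(state)
        actions.append(action)
        _, sent = state
        state = (time + 1, sent + (1 if action == "sendbit" else 0))
    return actions
```

Without a deviation both protocols produce the same actions; forcing skip in the first round yields no messages under `p_s` and exactly one message under `p_s_prime`, matching the two sets of closest points described above.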

3.2 Belief 

As we have just seen, in order to allow for counterfactual reasoning about a protocol $P$
in a context $\gamma$, our model needs to represent "counterfactual" runs that do not appear
in $\mathbf{R}(P, \gamma)$. Using the counterfactual system $\mathcal{J}^+(P)$, which includes all runs of $\mathcal{R}^+(\gamma)$,
provides considerable flexibility and generality in counterfactual reasoning. However,
doing this has a rather drastic impact on the processes' knowledge of the protocol
being used. Agents have considerable knowledge of the properties of protocol $P$ in the
interpreted system $\mathbf{I}(P, \gamma, \pi)$, since it contains only the runs of $\mathbf{R}(P, \gamma)$. For example, if
agent 1's first action in $P$ is always $b$, then all agents are guaranteed to know this fact
(provided that it is expressible in the language, of course); indeed, this fact will be
common knowledge, which means that the agents know it, know that they all know it, and
so on, for any depth of nesting of these knowledge
statements (cf. [Fagin, Halpern, Moses, and Vardi 1995; Halpern and Moses 1990]). If
we evaluate knowledge with respect to $\mathcal{I}^+(\gamma, \pi)$, then the agents have lost the knowledge
that they are running protocol $P$. We deal with this by adding extra information to the
models that allows us to capture the agents' beliefs. Although the agents will not know
they are running protocol $P$, they will believe that they are.

A ranking function for a system $\mathcal{R}$ is a function $\kappa : \mathcal{R} \to \mathbb{N}^+$, associating with every
run of $\mathcal{R}$ a rank $\kappa(r)$, which is either a natural number or $\infty$, such that $\min_{r \in \mathcal{R}} \kappa(r) = 0$.^
Intuitively, the rank of a run defines the likelihood of the run. Runs of rank 0 are most
likely; runs of rank 1 are somewhat less likely, those of rank 2 are even more unlikely,
and so on. Very roughly speaking, if $\epsilon > 0$ is small, we can think of the runs of rank $k$
as having probability $O(\epsilon^k)$. For our purposes, the key feature of rankings is that they
can be used to define a notion of belief (cf. [Friedman and Halpern 1997]). Intuitively,
of all the points considered possible by a given agent at a point $(r, m)$, the ones believed
to have occurred are the ones appearing in runs of minimal rank. More formally, for a
point $(r, m)$ define

$\min^\kappa_i(r, m) = \min\{\kappa(r') \mid r' \in \mathcal{R} \text{ and } r'_i(m') = r_i(m) \text{ for some } m' \ge 0\}$.

Thus, $\min^\kappa_i(r, m)$ is the minimal $\kappa$-rank of runs in which $r_i(m)$ appears as a local state
for agent $i$.

^The similarity in notation with the $\kappa$-rankings of [Goldszmidt and Pearl 1992], which are based on
Spohn's ordinal conditional functions [1988], is completely intentional. Indeed, everything we are saying
here can be recast in Spohn's framework.

An extended system is a triple of the form $\mathcal{J} = (\mathcal{I}, \ll, \kappa)$, where $(\mathcal{I}, \ll)$ is a
counterfactual system, and $\kappa$ is a ranking function for the runs of $\mathcal{I}$. In extended systems we
can define a notion of belief. The logical language that results from closing $\mathcal{L}^{>}$ (resp. $\mathcal{L}$)
under belief operators $B_i$, for $i = 1, \ldots, n$, is denoted $\mathcal{L}^{>}_B$ (resp. $\mathcal{L}_B$). The truth of $B_i\varphi$
is defined as follows:

$(\mathcal{I}, \ll, \kappa, r, m) \models B_i\varphi$ iff $(\mathcal{I}, \ll, \kappa, r', m') \models \varphi$ for all $(r', m')$ such that
$\kappa(r') = \min^\kappa_i(r, m)$ and $r'_i(m') = r_i(m)$.
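The truth condition for $B_i\varphi$ can be sketched on a finite system with an explicit ranking. This is only an illustration: the system is assumed to be a finite list of truncated runs with integer ranks, and the names `min_rank`, `believes`, `RUNS`, `RANK`, and `s_sent` are hypothetical.

```python
from typing import Callable, List, Tuple

Run = List[Tuple]            # global states at times 0, 1, ...; one local state per agent
Point = Tuple[int, int]      # (index of the run, time m)
Fact = Callable[[List[Run], Point], bool]


def min_rank(runs: List[Run], rank: List[int], i: int, point: Point) -> int:
    """Minimal rank of a run in which agent i's local state at `point` appears."""
    r, m = point
    local = runs[r][m][i]
    return min(rank[r2] for r2, run in enumerate(runs)
               if any(g[i] == local for g in run))


def believes(runs: List[Run], rank: List[int], i: int, fact: Fact, point: Point) -> bool:
    """B_i(fact): fact holds at all minimal-rank points with the same local state."""
    r, m = point
    local = runs[r][m][i]
    k = min_rank(runs, rank, i, point)
    return all(
        fact(runs, (r2, m2))
        for r2, run in enumerate(runs) if rank[r2] == k
        for m2, g in enumerate(run) if g[i] == local
    )


# Hypothetical system: agent 0 is S, agent 1 is R.
RUNS = [
    [("sent", "waiting"), ("sent", "got")],      # rank 0: S sent the bit
    [("idle", "waiting"), ("idle", "waiting")],  # rank 1: S deviated and stayed idle
]
RANK = [0, 1]


def s_sent(runs: List[Run], point: Point) -> bool:
    r, m = point
    return runs[r][m][0] == "sent"
```

In this toy system $R$ believes that $S$ has sent the bit whenever it is waiting, including at the point (1, 1) on the rank-1 run where that belief is false; on the rank-0 run the belief is correct.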

What distinguishes knowledge from belief is that knowledge satisfies the knowledge
axiom: $K_i\varphi \Rightarrow \varphi$ is valid. While $B_i\varphi \Rightarrow \varphi$ is not valid, it is true in runs of rank 0.

Lemma 3.1 Suppose that $\mathcal{J} = ((\mathcal{R}, \pi), \ll, \kappa)$ is an extended system, $r \in \mathcal{R}$, and $\kappa(r) = 0$.
Then for every formula $\varphi$ and all times $m$, we have $(\mathcal{J}, r, m) \models B_i\varphi \Rightarrow \varphi$.

Proof Assume that $\kappa(r) = 0$. Thus, $\min^\kappa_i(r, m) = 0$ for all times $m \ge 0$. It now
immediately follows from the definitions that if $(\mathcal{J}, r, m) \models B_i\varphi$, then $(\mathcal{J}, r, m) \models \varphi$. ■

By analogy with order generators, we now want a uniform way of associating with
each protocol $P$ a ranking function. Intuitively, we want to do this in a way that lets
us recover $P$. We say that a ranking function $\kappa$ is $P$-compatible (for $\gamma$) if $\kappa(r) = 0$ if
and only if $r \in \mathbf{R}(P, \gamma)$. A ranking generator for a context $\gamma$ is a function $\sigma$ ascribing
to every protocol $P$ a ranking $\sigma(P)$ on the runs of $\mathcal{R}^+(\gamma)$. A ranking generator $\sigma$ is
deviation compatible if $\sigma(P)$ is $P$-compatible for every protocol $P$. An obvious example
of a deviation-compatible ranking generator is the characteristic ranking generator
that, for a given protocol $P$, yields a ranking that assigns rank 0 to every run in $\mathbf{R}(P, \gamma)$
and rank 1 to all other runs. This captures the assumption that runs of $P$ are likely
and all other runs are unlikely, without attempting to distinguish among them. Another
deviation-compatible ranking generator is $\sigma^*$, where the ranking $\sigma^*(P)$ assigns to a run $r$
the total number of times that agents deviate from $P$ in $r$. Obviously, $\sigma^*(P)$ assigns $r$
the rank 0 exactly if $r \in \mathbf{R}(P, \gamma)$, as desired. Intuitively, $\sigma^*$ captures the assumption
that not only are deviations unlikely, but they are independent. It is clearly possible to
construct other $P$-compatible rankings that embody other assumptions. For example,
deviation can be taken to be an indication of faulty behavior. Runs of rank $k$ can be those
where exactly $k$ processes are faulty.
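The two ranking generators just described can be sketched as follows. The encoding is an assumption made for illustration: a run is a finite list of rounds, each round records for every agent the local state it was in and the action it performed, and a run is treated as consistent with the joint protocol exactly when no recorded action differs from what the protocol prescribes (the environment is ignored). All names are hypothetical.

```python
from typing import Callable, List, Tuple

AgentStep = Tuple[int, str]              # (local state, here just the time; action performed)
Round = Tuple[AgentStep, ...]            # one step per agent
Run = List[Round]
JointProtocol = List[Callable[[int], str]]
Ranking = Callable[[Run], int]


def deviations(run: Run, protocol: JointProtocol) -> int:
    """Total number of times some agent's action differs from its protocol."""
    return sum(
        1
        for rnd in run
        for i, (local, action) in enumerate(rnd)
        if protocol[i](local) != action
    )


def sigma_star(protocol: JointProtocol) -> Ranking:
    """Rank of a run is its number of deviations from the protocol."""
    return lambda run: deviations(run, protocol)


def characteristic(protocol: JointProtocol) -> Ranking:
    """Rank 0 for runs without deviations, rank 1 for all other runs."""
    return lambda run: 0 if deviations(run, protocol) == 0 else 1


# Hypothetical joint protocol: S sends only at time 0, R always skips.
PROTOCOL: JointProtocol = [
    lambda time: "sendbit" if time == 0 else "skip",
    lambda time: "skip",
]

RUN_OK: Run = [((0, "sendbit"), (0, "skip")), ((1, "skip"), (1, "skip"))]
RUN_ONE_DEVIATION: Run = [((0, "skip"), (0, "skip")), ((1, "skip"), (1, "skip"))]
RUN_THREE_DEVIATIONS: Run = [((0, "skip"), (0, "skip")), ((1, "sendbit"), (1, "sendbit"))]
```

Both rankings assign rank 0 exactly to the run without deviations, which is the $P$-compatibility condition in this simplified encoding; they differ only in how they grade the deviating runs.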

Our interest in deviation-compatible ranking generators is motivated by the observation
that the notion of belief that they give rise to in $\mathcal{I}^+(\gamma, \pi)$ generalizes the notion of
knowledge with respect to $\mathbf{I}(P, \gamma, \pi)$. To make this precise, define $\varphi^B$ to be the formula
that is obtained by replacing all $K_i$ operators in $\varphi$ by $B_i$. (Notice that if $\varphi \in \mathcal{L}_K$ then
$\varphi^B \in \mathcal{L}_B$.) In addition, since ranking generators now play a role in determining beliefs,
we define an interpreted belief context to be a triple of the form $(\gamma, \pi, \sigma)$.

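Theorem 3.2 Let $\sigma$ be a deviation-compatible ranking generator for $\gamma$. For every
formula $\varphi \in \mathcal{L}_K$ and for all points $(r, m)$ of $\mathcal{R} = \mathbf{R}(P, \gamma)$ and every ordering $\ll$ we have

$(\mathbf{I}(P, \gamma, \pi), r, m) \models \varphi$ iff $(\mathcal{I}^+(\gamma, \pi), \ll, \sigma(P), r, m) \models \varphi^B$.

Proof We proceed by induction on the structure of $\varphi$. For primitive propositions,
the result is immediate by definition, and the argument is trivial if $\varphi$ is a conjunction
or a negation. Thus, assume that $\varphi$ is of the form $K_i\psi$. Let $\kappa = \sigma(P)$. Then
$(\mathbf{I}(P, \gamma, \pi), r, m) \models K_i\psi$ iff $(\mathbf{I}(P, \gamma, \pi), r', m') \models \psi$ for all $(r', m')$ such that $r' \in \mathbf{R}(P, \gamma)$
and $r'_i(m') = r_i(m)$. But $r' \in \mathbf{R}(P, \gamma)$ iff $\kappa(r') = 0$. Thus, by the induction hypothesis,
$(\mathbf{I}(P, \gamma, \pi), r, m) \models K_i\psi$ iff
$(\mathcal{I}^+(\gamma, \pi), \ll, \kappa, r', m') \models \psi^B$ for all $(r', m')$ such that $\kappa(r') = 0$ and $r'_i(m') = r_i(m)$. Note that
$\min^\kappa_i(r, m) = 0$ (because $\kappa(r) = 0$). Thus, it easily follows that $(\mathbf{I}(P, \gamma, \pi), r, m) \models K_i\psi$
iff $(\mathcal{I}^+(\gamma, \pi), \ll, \kappa, r, m) \models B_i\psi^B$. ■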

In light of Theorem 3.2, from this point on we work with the larger system $\mathcal{I}^+(\gamma, \pi)$
and use belief relative to deviation-compatible ranking generators, instead of working
with the system $\mathbf{I}(P, \gamma, \pi)$ and using knowledge.

By having both ranking generators and order generators in our framework, we can
handle both belief and counterfactual reasoning. Thus, for example, we can write
$B_3(do(1, a) > \varphi)$ to represent agent 3's belief that if agent 1 were to perform action $a$
in the next round, then $\varphi$ would hold. We can further write $B_3(do(1, a) > \varphi) > \psi$
to state that were it the case that agent 3 had the above belief, then in fact $\psi$ would
hold. Arbitrary nesting of belief and counterfactuals is allowed. To take advantage of
the expressive features of the framework, we now define the analogue of knowledge-based
programs, to allow for belief and counterfactuals.

A counterfactual belief-based program (or cbb program for short) has the same form as
a knowledge-based program, except that the underlying logical language for the formulas
appearing in tests is now $\mathcal{L}^{>}_B$ instead of $\mathcal{L}_K$, and all tests in the program text $\mathsf{Pg}_i$ for
agent $i$ are formulas of the form $B_i\varphi$ or $\neg B_i\varphi$. As with knowledge-based programs, we
are interested in when a protocol $P$ implements a cbb program $\mathsf{Pg}_{cb}$. Again, the idea
is that the protocol should act according to the high-level program, when the tests are
evaluated relative to the counterfactual belief-based system corresponding to $P$. To make
this precise, given an extended system $\mathcal{J} = (\mathcal{I}, \ll, \kappa)$ and a cbb program $\mathsf{Pg}_{cb}$, let $\mathsf{Pg}_{cb}^{\mathcal{J}}$
denote the protocol derived from $\mathsf{Pg}_{cb}$ by using $\mathcal{J}$ to evaluate the belief tests. That is, a
test in $\mathsf{Pg}_{cb}$ such as $B_i\varphi$ holds at a point $(r, m)$ relative to $\mathcal{J}$ if $\varphi$ holds at all points $(r', m')$
in $\mathcal{I}$ such that $r'_i(m') = r_i(m)$ and $\kappa(r') = \min^\kappa_i(r, m)$. Define an extended context to
be a tuple $(\gamma, \pi, o, \sigma)$, where $(\gamma, \pi)$ is an interpreted context, $o$ is an ordering generator
for $\mathcal{R}^+(\gamma)$, and $\sigma$ is a deviation-compatible ranking generator for $\gamma$. An extended system
$(\mathcal{I}, \ll, \kappa)$ represents the belief-based program $\mathsf{Pg}_{cb}$ in extended context $(\gamma, \pi, o, \sigma)$ if (a)
$\mathcal{I} = \mathcal{I}^+(\gamma, \pi)$, (b) $\ll = o(\mathsf{Pg}_{cb}^{(\mathcal{I},\ll,\kappa)})$, and (c) $\kappa = \sigma(\mathsf{Pg}_{cb}^{(\mathcal{I},\ll,\kappa)})$. A protocol $P$ implements
$\mathsf{Pg}_{cb}$ in $(\gamma, \pi, o, \sigma)$ if $P = \mathsf{Pg}_{cb}^{(\mathcal{I}^+(\gamma,\pi),o(P),\sigma(P))}$. Protocol $P$ de facto implements $\mathsf{Pg}_{cb}$ in
$(\gamma, \pi, o, \sigma)$ if $P \approx_\gamma \mathsf{Pg}_{cb}^{(\mathcal{I}^+(\gamma,\pi),o(P),\sigma(P))}$.

There is a close connection between the notions of implementation for knowledge-based
programs and implementation for cbb programs using deviation-compatible rankings.
Given a knowledge-based program $\mathsf{Pg}_{kb}$, we denote by $\mathsf{Pg}^B_{kb}$ the program that
results from replacing every knowledge operator $K_i$ appearing in $\mathsf{Pg}_{kb}$ by $B_i$, for all agents
$i = 1, \ldots, n$. (This is, in particular, a cbb program with no counterfactual operators.)



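Theorem 3.3 Let $\mathsf{Pg}_{kb}$ be a knowledge-based program and let $\sigma$ be a deviation-compatible
ranking generator for $\gamma$. Moreover, let $o$ be an arbitrary ordering generator for $\mathcal{R}^+(\gamma)$. A
protocol $P$ de facto implements $\mathsf{Pg}_{kb}$ in $(\gamma, \pi)$ if and only if $P$ de facto implements $\mathsf{Pg}^B_{kb}$
in $(\gamma, \pi, o, \sigma)$.

Proof Since $\sigma$ is deviation compatible, by Theorem 3.2, for all points $(r, m)$ of $\mathbf{R}(P, \gamma)$,
we have that $(\mathbf{I}(P, \gamma, \pi), r, m) \models \varphi$ iff $(\mathcal{I}^+(\gamma, \pi), o(P), \sigma(P), r, m) \models \varphi^B$. Let $\mathsf{Pg}_{cb} = \mathsf{Pg}^B_{kb}$
and let $\mathcal{J}(P) = (\mathcal{I}^+(\gamma, \pi), o(P), \sigma(P))$. Then

$(\mathsf{Pg}_{kb})_i^{\mathbf{I}(P,\gamma,\pi)}(r_i(m)) = (\mathsf{Pg}_{cb})_i^{\mathcal{J}(P)}(r_i(m))$ whenever $r \in \mathbf{R}(P, \gamma)$. (1)

Now suppose that $P$ de facto implements $\mathsf{Pg}_{kb}$. By definition, $P \approx_\gamma \mathsf{Pg}_{kb}^{\mathbf{I}(P,\gamma,\pi)}$. Thus, the
only global states that arise when running $\mathsf{Pg}_{kb}^{\mathbf{I}(P,\gamma,\pi)}$ are those of the form $r(m)$ for some
$r \in \mathbf{R}(P, \gamma)$. It easily follows from (1) that $\mathbf{I}(\mathsf{Pg}_{kb}^{\mathbf{I}(P,\gamma,\pi)}, \gamma, \pi) = \mathbf{I}(\mathsf{Pg}_{cb}^{\mathcal{J}(P)}, \gamma, \pi)$. Thus, $P$
de facto implements $\mathsf{Pg}_{cb}$ as well. The argument in the other direction is analogous. ■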

Theorem 3.3 shows that a protocol $P$ de facto implements a knowledge-based program
iff $P$ de facto implements the corresponding belief-based program. Thus, by using
deviation-compatible rankings, cbb programs can essentially emulate knowledge-based
programs. The move to cbb programs as defined here thus provides what may be
considered a conservative extension of the knowledge-based framework: it allows us to treat
beliefs and counterfactuals, while being able to handle everything that the old theory
gave us without changing the results.

3.3 Analysis of the Bit-Transmission Problem



Recall the program BT^ from the introduction: if $K_S(recbit)$ then skip else sendbit.
With this program, $S$ keeps sending the bit until it knows that $R$ has received the
bit. As discussed in the introduction, it would be even more efficient for $S$ to stop
sending the bit once it knows that eventually $R$ will receive it. As we saw, replacing
$K_S(recbit)$ by $K_S(\Diamond recbit)$ leads to problems. We can deal with these problems by using
counterfactuals (and, thus, belief rather than knowledge), as in the cbb program $\mathsf{BT}^{>}$
from the introduction:

if $B_S(do(S, \mathsf{skip}) > \Diamond recbit)$ then skip else sendbit.

This program says that $S$ should send the bit unless it believes that even if it did
not send the bit in the current round, $R$ would eventually receive the bit. Similarly, the
program BT^^ says that $S$ should send the bit unless it believes that $R$ would eventually
correctly believe its value:

if $B_S(do(S, \mathsf{skip}) > \Diamond B_R(bit))$ then skip else sendbit.

(Recall that $B_R(bit)$ is short for $(bit = 0 \wedge B_R(bit = 0)) \vee (bit = 1 \wedge B_R(bit = 1))$.)

Let BT> = (BT^,SKIPr) and, similarly, let BT^-^ = (BT^^, SKIP^). We now consider 
the implementations of BT^ and BT*"^ in three different contexts: 

• $\gamma_1$, in which messages are guaranteed to be delivered within five rounds;^

• $\gamma_2$, in which messages are guaranteed to arrive eventually, but there is no upper
bound on message delivery time; and

• $\gamma_3$, in which a message that is sent infinitely often is guaranteed to arrive, but there
is no upper bound on message delivery time. (Nothing can be said about a message
sent only finitely often; this is a standard type of fairness assumed in the literature
[Francez 1986].)

In all contexts that we consider, messages cannot be reordered or duplicated. Moreover,
a message can be delivered only if it was previously sent. We assume for now that we are 
working in synchronous systems, so that processes can keep track of the round number. 
(Indeed, we cannot really make sense out of messages being delivered in five rounds in 
asynchronous systems.) At the end of this section we briefly comment on how our results 
can be modified to apply to asynchronous systems. We now describe these contexts more 
formally. 

In $\gamma_1 = (P_e^1, \mathcal{G}_0^1, \tau^1, \Psi^1)$, an agent can perform one of two actions: skip and sendbit,
with the obvious outcome. The local state of $S$ consists of three components: (a) a
Boolean variable bit that is fixed throughout the run, (b) a clock value, encoded in the
variable time, which is always equal to the round number; at a point $(r, m)$ the clock
value is $m$, and (c) the message history, which is the sequence of messages that $S$ has sent
and received, each marked by the time at which it was sent or received. The local state of the
receiver $R$ consists of the clock value and $R$'s message history. Assume that the set of
initial states in $\gamma_1$ consists of two states: one in which bit = 0 and one in which bit = 1.
In both states the clock values are 0 and message histories are empty. In this context,
messages are guaranteed to be delivered within at most five rounds. The environment
can perform the action of delivering a message. Its protocol consists of deciding when
messages are delivered, subject to this constraint. Since the environment's state keeps
track of all actions performed, it can be determined from the state which messages are
in transit and how long they have been in transit. $\Psi^1$ makes no restrictions: all runs are
considered admissible.

^There is nothing special about five rounds here; any other fixed number would do for the purposes
of this example.

The context $\gamma_2 = (P_e^2, \mathcal{G}_0^2, \tau^2, \Psi^2)$ is a variant of $\gamma_1$ with asynchronous communication.
Here $\mathcal{G}_0^2 = \mathcal{G}_0^1$, and the local states of $S$ and $R$ are the same as in $\gamma_1$. Every message sent is
guaranteed to be delivered, but there is no bound on the time it will spend in transit.
Thus, the environment's state again keeps track of the messages in transit, while the
environment's protocol decides at each point (nondeterministically) which, if any,
of the messages in transit should be delivered in the current round. The constraint
that messages are guaranteed to eventually be delivered is captured by the admissibility
constraint $\Psi^2$: the set $\Psi^2$ consists of the runs in which every message sent is eventually
delivered.

The only difference between $\gamma_3 = (P_e^3, \mathcal{G}_0^3, \tau^3, \Psi^3)$ and $\gamma_2$ is that the admissibility
condition $\Psi^3$ is more liberal than (i.e., is a superset of) $\Psi^2$. The set $\Psi^3$ consists of all
runs $r$ that are fair in the sense that, for every time $m$, if a given message $\mu$ is sent
infinitely often in $r$ after time $m$, then at least one of the copies of $\mu$ sent after time $m$
is delivered.

We define three sets of extended contexts, extending $\gamma_i$, $i = 1, 2, 3$. Let $EC_i$ consist of
all contexts of the form $(\gamma_i, \pi, o, \sigma)$, $i = 1, 2, 3$, where $\pi$ interprets the propositions bit = 0
and bit = 1 in the natural way, $o$ respects protocols, and $\sigma$ is deviation compatible.

We claim that both $\mathsf{BT}^{>}$ and BT^^ solve the bit-transmission problem in every
extended context in $EC_i$, $i = 1, 2, 3$. But what does it mean for a protocol to "solve" the
bit-transmission problem? To make this precise, we need to specify the problem. In the
case of the bit-transmission problem, the specification is simple: we want the receiver to
eventually know the bit. Thus, we say that a cbb program $\mathsf{Pg}$ solves the bit-transmission
problem in extended context $\zeta = (\gamma, \pi, o, \sigma)$ if, for every protocol $P$ that de facto
implements $\mathsf{Pg}$, we have that $(\mathcal{J}^+(P, \zeta), r, 0) \models \Diamond B_R(bit)$ for every run $r \in \mathbf{R}(P, \gamma)$. Notice
that using belief here is safe, because we are requiring only that the belief hold in runs
of $P$. Lemma 3.1 guarantees that, in these runs (which all have rank 0), the beliefs are
true.

Theorem 3.4 Both $\mathsf{BT}^{>}$ and BT*^ solve the bit-transmission problem in all the
extended contexts $EC_1 \cup EC_2 \cup EC_3$.



Proof Let C = (7, tt, o, a) be a context in ECi U EC2 U i^Cs and assume that P de 
facto implements BT> or BT*-^ in C- Let = (J(P, 7), o(P), ct(P)) and let r e R(P,7) 
be a run of P in 7. We first consider the case that P implements BT^; the argument in 
the case that P implements BT*^ is even easier, and is sketched afterwards. There are 
two cases: 

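(a) Suppose that $(\mathcal{J}, r, m) \models B_S(\mathit{do}(S, \mathsf{skip}) > \Diamond \mathit{recbit})$ for some $m \ge 0$. Since $P$ de
facto implements $\mathrm{BT}^{>}$, $S$ performs $\mathsf{skip}$ in round $m + 1$ of $r$. Thus, we have that
$(\mathcal{J}, r, m) \models \mathit{do}(S, \mathsf{skip})$. Since $\sigma(P)$ is deviation compatible and $r \in \mathbf{R}(P, \gamma)$,
it follows that $(\mathcal{J}, r, m) \models \mathit{do}(S, \mathsf{skip}) > \Diamond \mathit{recbit}$. Since $o$ respects protocols,
$(r, m) \in \mathsf{closest}([\![\mathit{do}(S, \mathsf{skip})]\!], (r, m), \mathcal{J})$. It now follows from the semantics of
$>$ that $(\mathcal{J}, r, m) \models \Diamond \mathit{recbit}$. Since $P$ de facto implements $\mathrm{BT}^{>}$, if $S$ sends a value
in a run $r'$ of $P$, $S$ is actually sending the bit. Since $\sigma(P)$ is deviation compatible, it
follows that in every run $r'$ of $P$, we have that $(\mathcal{J}, r', m') \models \mathit{recbit} \Rightarrow B_R(\mathit{bit})$, since
all the points in $\min_R^{\kappa}(r', m')$ are points on runs of $P$. Thus, $(\mathcal{J}, r, m) \models \Diamond B_R(\mathit{bit})$.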

(b) Suppose that $(\mathcal{J}, r, m) \not\models B_S(\mathit{do}(S, \mathsf{skip}) > \Diamond \mathit{recbit})$ for all $m \ge 0$. Since $P$ de facto
implements $\mathrm{BT}^{>}$, it follows that $S$ sends the bit in every round of $r$. (In particular,
the bit is sent by $S$ infinitely often.) All three contexts under consideration have the
property that a message sent infinitely often is guaranteed to be delivered. Thus, at
some time $m' \ge 0$ in $r$, the receiver will receive the bit; that is, $(\mathcal{J}, r, m') \models \mathit{recbit}$
for some $m' \ge 0$. Then, just as in part (a), it follows that $(\mathcal{J}, r, m') \models B_R(\mathit{bit})$,
and hence that $(\mathcal{J}, r, 0) \models \Diamond B_R(\mathit{bit})$.

The argument is almost identical (and somewhat simpler) if $P$ implements $\mathrm{BT}^{\Diamond B}$.
Now we split into two cases according to whether there is some $m$ such that $(\mathcal{J}, r, m) \models
B_S(\mathit{do}(S, \mathsf{skip}) > \Diamond B_R(\mathit{bit}))$. Using the same arguments as above (but skipping the
argument that $\mathcal{J} \models \mathit{recbit} \Rightarrow B_R(\mathit{bit})$) we get that, in both cases, $(\mathcal{J}, r, 0) \models \Diamond B_R(\mathit{bit})$. ∎

Theorem 3.4, while useful, does not give us all we want. In particular, it shows
neither that $\mathrm{BT}^{>}$ or $\mathrm{BT}^{\Diamond B}$ is implementable nor that $S$ sends relatively few messages
according to any protocol that implements $\mathrm{BT}^{>}$ or $\mathrm{BT}^{\Diamond B}$ (which, after all, was the goal
of using counterfactuals in this setting). In fact, as we now show, both $\mathrm{BT}^{>}$ and $\mathrm{BT}^{\Diamond B}$
are implementable in all three sets of contexts, and their implementations are as
message-efficient as possible. We consider each of $EC_1$, $EC_2$, and $EC_3$ in turn.

Intuitively, in order to solve the bit-transmission problem in a context in which
messages are always delivered, sending the bit only once in any given run should suffice.
Consider the collection of protocols $P^1(k, m) = (P^1_S(k, m), \mathsf{SKIP}_R)$ for $k, m \in \mathbb{N}$, where
$P^1_S(k, m)$ is described by the program

if (time = k and bit = 0) or (time = m and bit = 1) then sendbit else skip.

In these protocols, the sender $S$ sends its bit at time $k$ if the bit value is 0, and at time $m$
if it is 1. We now show that all protocols of the form $P^1(k, m)$ implement $\mathrm{BT}^{>}$ in all
contexts in $EC_1$.
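
As a concrete reading of the program above, the sender's decision rule can be sketched in Python. This is only an illustrative sketch of the rule itself, not of the runs-and-systems semantics; the names `p1_sender` and `send_times` and the string constants are hypothetical and do not appear in the paper.

```python
from typing import Callable, List

SENDBIT, SKIP = "sendbit", "skip"


def p1_sender(k: int, m: int) -> Callable[[int, int], str]:
    """Return the sender's rule for P^1(k, m): send at time k if bit = 0, at time m if bit = 1."""
    def act(time: int, bit: int) -> str:
        if (time == k and bit == 0) or (time == m and bit == 1):
            return SENDBIT
        return SKIP
    return act


def send_times(act: Callable[[int, int], str], bit: int, horizon: int) -> List[int]:
    """List the times in 0..horizon-1 at which the sender performs sendbit."""
    return [t for t in range(horizon) if act(t, bit) == SENDBIT]
```

For any `horizon` larger than both `k` and `m`, `send_times` returns a single time, matching the observation that the sender sends exactly one message in each run of the protocol.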

Lemma 3.5 The protocol $P^1(k, m)$ de facto implements $\mathrm{BT}^{>}$ in every extended context
in $EC_1$.

Proof Fix $k$, $m$, and a context $\zeta = (\gamma_1, \pi, o, \sigma) \in EC_1$. We want to show that
$P^1(k, m) \approx_{\gamma_1} (\mathrm{BT}^{>})^{\mathcal{J}(k,m)}$, where $\mathcal{J}(k, m) = (\mathcal{I}^+(\gamma_1, \pi), o(P^1(k, m)), \sigma(P^1(k, m)))$. We
can characterize a run consistent with $P^1(k, m)$ by the value of $\mathit{bit}$ and when the one
message sent by $S$ is received. Let $r_{b,n}$ be the run where $\mathit{bit} = b$ and the message is
received at time $n$ (clearly $k + 5 \ge n > k$ if $b = 0$ and $m + 5 \ge n > m$ if $b = 1$). Clearly
the formula $\mathit{recbit}$ holds in run $r_{b,n}$ from time $n$ on. Thus, $\Diamond \mathit{recbit}$ holds at every point
in every run consistent with $P^1(k, m)$ in the system $\mathcal{J}(k, m)$. Note that the runs $r_{b,n}$ are
precisely those of rank 0 in $\mathcal{J}(k, m)$.

We now show that a run $r$ is consistent with $(\mathrm{BT}^{>})^{\mathcal{J}(k,m)}$ in $\gamma_1$ iff $r = r_{b,n}$ for $b \in \{0, 1\}$
and a value of $n$ satisfying $k + 5 \ge n > k$ if $b = 0$ and $m + 5 \ge n > m$ if $b = 1$. So
suppose that $r$ is consistent with $(\mathrm{BT}^{>})^{\mathcal{J}(k,m)}$ and the value of the bit in $r$ is 0. It
suffices to show that $S$ sends exactly one message in $r$, and that happens at time $k$.
If $n' \ne k$, then clearly $(\mathcal{J}(k, m), r, n') \models \mathit{do}(S, \mathsf{skip}) > \Diamond \mathit{recbit}$, since the closest point
to $(r, n')$ where $\mathit{do}(S, \mathsf{skip})$ holds is $(r, n')$ itself. On the other hand, if $n' = k$, then
$\mathsf{closest}([\![\mathit{do}(S, \mathsf{skip})]\!], (r, n'), \mathcal{J}(k, m)) = \{(r'_0, n')\}$, where $r'_0$ is the run where $S$ never
sends any messages and the initial bit is 0. In this case, the properties of $\gamma_1$ guarantee
that no message is ever received by $R$ in $r'_0$, and $\Diamond \mathit{recbit}$ does not hold at $(r'_0, k)$. It follows
that the test $B_S(\mathit{do}(S, \mathsf{skip}) > \Diamond \mathit{recbit})$ fails at $(r, k)$, and $r$ is consistent with $\mathrm{BT}^{>}$ if and
only if the action $\mathsf{sendbit}$ is performed in round $k + 1$ of $r$. Hence, $r$ is one of the runs
$r_{0,n}$ with $k + 5 \ge n > k$. A completely analogous treatment applies if $\mathit{bit} = 1$ in $r$. We
thus have that exactly the runs $r_{b,n}$ described are consistent with $(\mathrm{BT}^{>})^{\mathcal{J}(k,m)}$ in $\gamma_1$, and
hence $P^1(k, m)$ de facto implements $\mathrm{BT}^{>}$ in every extended context in $EC_1$, as desired. ∎

In the context $\gamma_1$, there is a fixed bound on message delivery time. As a result,
we might hope to save on message delivery in some cases. Suppose that we use a
one-sided protocol that sends the bit only if $\mathit{bit} = 0$. Then the receiver should be able
to conclude that the value of the bit is 1 if a message stating that the bit is 0 does not
arrive within the specified time bounds. More generally, define the collection of protocols
$P^2(k, b) = (P^2_S(k, b), \mathsf{SKIP}_R)$ for $b \in \{0, 1\}$ and $k \in \mathbb{N}$, where $P^2_S(k, b)$ is the protocol
implementing the program

if time = k and bit = b then sendbit else skip.

According to $P^2(k, b)$, the sender $S$ sends a message only in runs where the bit is $b$; if the
bit is $1 - b$, it sends no messages. Moreover, in runs where the bit is $b$, $S$ sends only one
message, at time $k$. This type of optimization (sending a message only for one of the two
bit values) was used in the message-optimal protocols of [Hadzilacos and Halpern 1993];
it can be used in synchronous systems in which there is an upper bound on the message
delivery time, as in contexts in $EC_1$.
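
The way silence carries information under a delivery bound can be illustrated with a small Python sketch. It assumes, as in $\gamma_1$, that a message sent at time $k$ arrives at some time between $k+1$ and $k+5$, and it considers only the runs consistent with the one-sided protocol (it does not model ranks or counterfactual runs). The names `p2_runs`, `receiver_view`, and `bits_considered_possible` are hypothetical.

```python
from typing import Dict, List, Optional, Set

MAX_DELAY = 5  # delivery bound assumed for the context gamma_1


def p2_runs(k: int, b: int) -> List[Dict[str, Optional[int]]]:
    """Enumerate the runs consistent with the one-sided protocol P^2(k, b)."""
    runs = [{"bit": b, "sent_at": k, "received_at": k + d}
            for d in range(1, MAX_DELAY + 1)]
    # If the bit is 1 - b, the sender stays silent forever.
    runs.append({"bit": 1 - b, "sent_at": None, "received_at": None})
    return runs


def receiver_view(run: Dict[str, Optional[int]], n: int) -> Optional[int]:
    """What R has observed by time n: the bit if it has arrived, otherwise None."""
    received_at = run["received_at"]
    if received_at is not None and received_at <= n:
        return run["bit"]
    return None


def bits_considered_possible(k: int, b: int, observed: Optional[int], n: int) -> Set[int]:
    """Bit values in the protocol's runs that match R's observation at time n."""
    return {run["bit"] for run in p2_runs(k, b)
            if receiver_view(run, n) == observed}
```

With `k = 3` and `b = 0`, a receiver that has seen nothing by time 7 still considers both values possible, while by time 8 (that is, $k + 5$) only the value $1 - b$ remains.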

It is easy to verify that $P^2(k, b)$ does not implement $\mathrm{BT}^{>}$. Intuitively, in a run $r$ of
$P^2(k, b)$ with $\mathit{bit} = 1 - b$, the sender $S$ never sends the bit, and hence $\Diamond \mathit{recbit}$ does not
hold. Since $S$ follows $P^2(k, b)$ in $r$, the formula $\mathit{do}(S, \mathsf{skip})$ holds at time 0 in $r$. It follows
that in evaluating the test $B_S(\mathit{do}(S, \mathsf{skip}) > \Diamond \mathit{recbit})$ the closest point to $(r, 0)$ is $(r, 0)$
itself. Because $\Diamond \mathit{recbit}$ does not hold at that point, the test fails, and according to $\mathrm{BT}^{>}$
the sender $S$ should perform $\mathsf{sendbit}$. Since, in fact, $S$ does not perform $\mathsf{sendbit}$ at $(r, 0)$,
and $r$ is a run of $P^2(k, b)$, we conclude that $P^2(k, b)$ does not implement $\mathrm{BT}^{>}$. However,
as we now show, $P^2(k, b)$ does implement the more sophisticated program $\mathrm{BT}^{\Diamond B}$:

Lemma 3.6 Every instance of $P^2(k, b)$ de facto implements $\mathrm{BT}^{\Diamond B}$ in every context in
$EC_1$.

Proof Fix $k$, $b$, and a context $\zeta = (\gamma_1, \pi, o, \sigma) \in EC_1$. We want to show that
$P^2(k, b) \approx_{\gamma_1} (\mathrm{BT}^{\Diamond B})^{\mathcal{J}(k,b)}$, where $\mathcal{J}(k, b) = (\mathcal{I}^+(\gamma_1, \pi), o(P^2(k, b)), \sigma(P^2(k, b)))$. Note
that there are exactly six runs consistent with $P^2(k, b)$ in context $\gamma_1$: five runs $r_m$,
$m = k + 1, \ldots, k + 5$, where the value of the bit is $b$, the message is sent in round $k + 1$
and it arrives in round $m$; the sixth run is $r_{1-b}$, where the value of the bit is $1 - b$ and
no message is sent. It is easy to check that in the extended system $\mathcal{J}(k, b)$, the formula
$\mathit{bit} = b \land B_R(\mathit{bit} = b)$ holds in runs $r_m$ from time $m$ on, while in run $r_{1-b}$ the formula
$\mathit{bit} = 1 - b \land B_R(\mathit{bit} = 1 - b)$ holds from time $k + 5$ on. Thus, $\Diamond B_R(\mathit{bit})$ holds at every
point in the six runs in $\mathbf{R}(P^2(k, b), \gamma_1)$. Note that these six runs are exactly the runs of
rank 0.

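We now show that $r$ is consistent with $(\mathrm{BT}^{\Diamond B})^{\mathcal{J}(k,b)}$ iff $r \in \mathbf{R}(P^2(k, b), \gamma_1)$. We
consider two cases, according to the value of the bit in $r$. First suppose that $\mathit{bit} = 1 - b$
in the run $r$. We prove by induction on $m' \ge 0$ that (a) if $r$ is consistent with $(\mathrm{BT}^{\Diamond B})^{\mathcal{J}(k,b)}$
then (i) $r(m') = r_{1-b}(m')$ and (ii) $(\mathrm{BT}^{\Diamond B})_S^{\mathcal{J}(k,b)}(r_S(m')) = \mathsf{skip}$, and (b) $r_{1-b}$ is consistent
with $(\mathrm{BT}^{\Diamond B})^{\mathcal{J}(k,b)}$ up to time $m'$. For the base case, observe that $r(0) = r_{1-b}(0)$ because
there is only one initial state in $\gamma_1$ with $\mathit{bit} = 1 - b$. Clearly $r_{1-b}$ is consistent with
$(\mathrm{BT}^{\Diamond B})^{\mathcal{J}(k,b)}$ up to time 0. Thus, parts (a)(i) and (b) hold. For part (a)(ii), to see that
$(\mathrm{BT}^{\Diamond B})_S^{\mathcal{J}(k,b)}(r_S(0)) = \mathsf{skip}$, it suffices to show that $(\mathcal{J}(k, b), r, 0) \models B_S(\mathit{do}(S, \mathsf{skip}) >
\Diamond B_R(\mathit{bit}))$. Since $\sigma$ is deviation compatible and $S$ knows that $\mathit{bit} = 1 - b$, it follows
that $\min_S^{\sigma(P^2(k,b))}(r, 0) = \{(r_{1-b}, 0)\}$. Thus, it suffices to show that $(\mathcal{J}(k, b), r_{1-b}, 0) \models
\mathit{do}(S, \mathsf{skip}) > \Diamond B_R(\mathit{bit})$. But this is immediate from the fact that $(\mathcal{J}(k, b), r_{1-b}, 0) \models
\mathit{do}(S, \mathsf{skip})$ and, as observed earlier, that $(\mathcal{J}(k, b), r_{1-b}, 0) \models \Diamond B_R(\mathit{bit})$.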

For the inductive step in the case $\mathit{bit} = 1 - b$, assume that the inductive claim holds
for time $m' \ge 0$. We want to show that it holds at time $m' + 1$. Parts (a)(i) and (b) are
immediate from the inductive hypothesis. The argument for part (a)(ii) is the same as in
the base case. This completes the inductive argument. It follows immediately from the
induction that $r_{1-b}$ is consistent with $(\mathrm{BT}^{\Diamond B})^{\mathcal{J}(k,b)}$ and that if $r$ is consistent with $(\mathrm{BT}^{\Diamond B})^{\mathcal{J}(k,b)}$
and $\mathit{bit} = 1 - b$ in $r$, then $r = r_{1-b}$.

Now consider the case where $\mathit{bit} = b$ in $r$. Define b-runs to be the set $\{r_{k+1}, r_{k+2}, \ldots, r_{k+5}\}$,
and b-pts($m'$) to be $\{(r_{k+1}, m'), (r_{k+2}, m'), \ldots, (r_{k+5}, m')\}$. We show by induction on
$m' \ge 0$ that if $r$ is consistent with $(\mathrm{BT}^{\Diamond B})^{\mathcal{J}(k,b)}$, then

(a) $r(m') \in$ b-pts($m'$),

(b) $(\mathrm{BT}^{\Diamond B})_S^{\mathcal{J}(k,b)}(r_S(m')) = \mathsf{skip}$ if $m' \ne k$, and $(\mathrm{BT}^{\Diamond B})_S^{\mathcal{J}(k,b)}(r_S(m')) = \mathsf{sendbit}$ if $m' = k$,

(c) at least one run in b-runs agrees with $r$ up to time $m'$; moreover, if $m' \ge k + 5$,
then exactly one run in b-runs agrees with $r$ up to time $m'$.

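For the base case, it is again immediate that $r(0) \in$ b-pts(0) and that all runs in b-runs
agree with $r$ up to time 0. To see that part (b) holds, first note that $\min_S^{\sigma(P^2(k,b))}(r, 0) =
\{(r_{k+k'}, 0) : k' = 1, \ldots, 5\}$. There are now two cases: if $k = 0$ (so that $S$ sends a
message in round 1 of all the runs in b-runs), then we must show that $(\mathcal{J}(k, b), r, 0) \models
\neg B_S(\mathit{do}(S, \mathsf{skip}) > \Diamond B_R(\mathit{bit}))$, so that $(\mathrm{BT}^{\Diamond B})_S^{\mathcal{J}(k,b)}(r_S(0)) = \mathsf{sendbit}$. Note that, if $k = 0$,
then $\mathsf{closest}([\![\mathit{do}(S, \mathsf{skip})]\!], (r_{k+k'}, 0), \mathcal{J}(k, b)) = \{(r^*, 0)\}$ for $k' = 1, \ldots, 5$, where $r^*$ is the run
where $\mathit{bit} = b$ and no messages are ever sent by $S$ or $R$. Thus, it suffices to show that
$(\mathcal{J}(k, b), r^*, 0) \models \neg \Diamond B_R(\mathit{bit})$. It is easy to see that, since $\sigma$ is deviation compatible,
we must have $(r_{1-b}, m) \in \min_R^{\sigma(P^2(k,b))}(r^*, m)$ for all $m \ge 0$. Thus, although $\mathit{bit} = b$ in $r^*$,
the receiver never believes that $\mathit{bit} = b$ in $r^*$; that is, $(\mathcal{J}(k, b), r^*, m) \models \mathit{bit} = b \land \neg B_R(\mathit{bit} = b)$
for all $m \ge 0$, and hence $(\mathcal{J}(k, b), r^*, m') \models \neg \Diamond B_R(\mathit{bit})$ for all $m' \ge 0$, as
desired. On the other hand, if $k > 0$, we must show that $(\mathcal{J}(k, b), r, 0) \models B_S(\mathit{do}(S, \mathsf{skip}) >
\Diamond B_R(\mathit{bit}))$. Note that if $k > 0$, then $\mathsf{closest}([\![\mathit{do}(S, \mathsf{skip})]\!], (r_{k+k'}, 0), \mathcal{J}(k, b)) = \{(r_{k+k'}, 0)\}$, for
$k' = 1, \ldots, 5$. Since $(\mathcal{J}(k, b), r_{k+k'}, 0) \models \mathit{do}(S, \mathsf{skip}) \land \Diamond B_R(\mathit{bit})$, we are done.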

The argument in the inductive step is almost identical, except that it now breaks into
the cases $m' < k$, $m' = k$, $k < m' < k + 5$, and $m' \ge k + 5$. We leave details to the
reader.

Finally, we must show that each run $r \in$ b-runs is consistent with $(\mathrm{BT}^{\Diamond B})^{\mathcal{J}(k,b)}$. We
proceed by induction on $m'$ to show that $r$ is consistent with $(\mathrm{BT}^{\Diamond B})^{\mathcal{J}(k,b)}$ up to time $m'$.
This involves proving part (b) of the induction above for each $r \in$ b-runs. The proof is
similar to that above, and left to the reader. ∎

The preceding discussion has shown that $P^2(k, b)$ implements $\mathrm{BT}^{\Diamond B}$, but not $\mathrm{BT}^{>}$, in
contexts in $EC_1$. Lemma 3.5 shows that $P^1(k, m)$ implements $\mathrm{BT}^{>}$ in contexts in $EC_1$.
An obvious question is whether $P^1(k, m)$ implements $\mathrm{BT}^{\Diamond B}$ in contexts in $EC_1$. We now
show that if $k \ne m$, then $P^1(k, m)$ does not implement $\mathrm{BT}^{\Diamond B}$; if $k = m$, then whether
$P^1(k, m)$ implements $\mathrm{BT}^{\Diamond B}$ depends on what the receiver believes in runs where he does
not receive a message. Since there is no run of $P^1(k, m)$ where the receiver receives no
messages, this is not determined by just assuming that we have a deviation-compatible
ranking generator. Given a ranking $\kappa$, let $K(n, b)$ be the rank of the run with least rank
where (a) the receiver does not receive any messages up to and including time $n$ and (b)
the bit has value $b$. We say that a ranking $\kappa$ is biased if $K(n, 0) \ne K(n, 1)$ holds for at least
one time instant $n$. Note that if $K(n, i) < K(n, i \oplus 1)$ then, in the absence of messages, $R$
will believe that the bit is $i$ at time $n$.
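
The definition of a biased ranking can be made concrete with a small Python sketch. It is a simplification under stated assumptions: a run is represented only by its bit, the time at which the receiver gets the message (or `None` if it never does), and its rank; a single "silent" deviating run per bit value stands in for all runs in which no message is sent; and bias is checked only up to a finite horizon. The names `least_rank`, `is_biased`, and `p1_runs` are hypothetical.

```python
import math
from typing import List, Optional, Tuple

# (bit, time the message is received or None, rank)
Run = Tuple[int, Optional[int], float]


def least_rank(runs: List[Run], n: int, b: int) -> float:
    """K(n, b): least rank of a run with bit b where R has received nothing up to time n."""
    ranks = [rank for (bit, received_at, rank) in runs
             if bit == b and (received_at is None or received_at > n)]
    # Assumption: if no such run exists, treat the rank as infinite.
    return min(ranks) if ranks else math.inf


def is_biased(runs: List[Run], horizon: int) -> bool:
    """True if K(n, 0) != K(n, 1) for some time n in 0..horizon."""
    return any(least_rank(runs, n, 0) != least_rank(runs, n, 1)
               for n in range(horizon + 1))


def p1_runs(k: int, m: int, silent_ranks: Tuple[float, float],
            max_delay: int = 5) -> List[Run]:
    """Rank-0 runs of P^1(k, m) plus one silent deviating run per bit value."""
    runs: List[Run] = [(0, k + d, 0) for d in range(1, max_delay + 1)]
    runs += [(1, m + d, 0) for d in range(1, max_delay + 1)]
    runs += [(0, None, silent_ranks[0]), (1, None, silent_ranks[1])]
    return runs
```

In this toy model, giving the two silent runs equal rank with `k == m` yields an unbiased ranking, while with `k < m` the rank-0 run in which the bit-1 message is still in transit at time `m + 4` makes `least_rank(runs, m + 4, 1)` strictly smaller than `least_rank(runs, m + 4, 0)`.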

Lemma 3.7 Let $\zeta = (\gamma_1, \pi, o, \sigma) \in EC_1$. The protocol $P^1(k, m)$ de facto implements
$\mathrm{BT}^{\Diamond B}$ in $\zeta$ exactly if both (a) $k = m$ and (b) $\sigma(P^1(k, k))$ is not biased.

Proof Fix a context $\zeta = (\gamma_1, \pi, o, \sigma) \in EC_1$. As in the proof of Lemma 3.5, define
$\mathcal{J}(k, m) = (\mathcal{I}^+(\gamma_1, \pi), o(P^1(k, m)), \sigma(P^1(k, m)))$ and the runs $r_{b,n}$.

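First suppose that $\kappa = \sigma(P^1(k, k))$ is not biased. We show that $P^1(k, k)$ de facto
implements $\mathrm{BT}^{\Diamond B}$ in $\zeta$. By definition, in each of the ten runs $r_{b,n}$ of rank 0 in the extended
system $\mathcal{J}(k, k)$, $\mathit{recbit}$ holds at the time $n$ when the receiver $R$ receives the bit. Since $R$
receives the correct bit, it is easy to see that in fact $(\mathcal{J}(k, k), r_{b,n}, n) \models B_R(\mathit{bit})$. Thus,
$\Diamond B_R(\mathit{bit})$ holds at every point in the ten runs of the form $r_{b,n}$ in the system $\mathcal{J}(k, k)$.
Moreover, $(\mathcal{J}(k, k), r_{b,n}, m) \models \mathit{do}(S, \mathsf{skip}) > \Diamond B_R(\mathit{bit})$ for $m \ne k$. Since the runs $r_{b,n}$
are the runs of rank 0, it actually follows that $(\mathcal{J}(k, k), r_{b,n}, m) \models B_S(\mathit{do}(S, \mathsf{skip}) >
\Diamond B_R(\mathit{bit}))$ for $m \ne k$. We now show that $(\mathcal{J}(k, k), r_{b,n}, k) \models \neg B_S(\mathit{do}(S, \mathsf{skip}) > \Diamond B_R(\mathit{bit}))$.
Note that $\mathsf{closest}([\![\mathit{do}(S, \mathsf{skip})]\!], (r_{b,n}, k), \mathcal{J}(k, k)) = \{(r'_b, k)\}$, where $r'_b$ is the run where
the bit is $b$ and $S$ sends no messages. Suppose that $(\mathcal{J}(k, k), r'_b, k) \models \Diamond B_R(\mathit{bit} = b)$.
Thus, there is some $n \ge k$ such that $(\mathcal{J}(k, k), r'_b, n) \models B_R(\mathit{bit} = b)$. Then we must
have $K(n, b) < K(n, b \oplus 1)$, so that $\kappa$ is biased, contradicting the assumption. Thus,
$(\mathcal{J}(k, k), r'_b, k) \models \neg \Diamond B_R(\mathit{bit} = b)$, so $(\mathcal{J}(k, k), r_{b,n}, k) \models \neg B_S(\mathit{do}(S, \mathsf{skip}) > \Diamond B_R(\mathit{bit}))$,
as desired. In this case, by $(\mathrm{BT}^{\Diamond B})^{\mathcal{J}(k,k)}$, the sender $S$ should perform $\mathsf{sendbit}$ at time $k$.
It follows that $r_{b,n}$ is consistent with $(\mathrm{BT}^{\Diamond B})^{\mathcal{J}(k,k)}$.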

We next show that if $r$ is consistent with $(\mathrm{BT}^{\Diamond B})^{\mathcal{J}(k,k)}$, then $r \in \{r_{b,n} : b = 0, 1,\ n =
k + 1, \ldots, k + 5\}$. So suppose that the bit is $b$ in $r$ and that $r$ is consistent with
$(\mathrm{BT}^{\Diamond B})^{\mathcal{J}(k,k)}$. Just as in the proof of Lemma 3.6, it is easy to show by induction on
$m$ that no messages are sent in $r$ at time $m < k$: It is easy to see that $(\mathcal{J}(k, k), r, m) \models
B_S(\mathit{do}(S, \mathsf{skip}) > \Diamond B_R(\mathit{bit}))$ for $m < k$, since $(r, m) \sim_S (r_{b,n}, m)$. Just as in the case of
$r_{b,n}$, we can show that $(\mathcal{J}(k, k), r, k) \models \neg B_S(\mathit{do}(S, \mathsf{skip}) > \Diamond B_R(\mathit{bit}))$. Thus, since $r$ is
consistent with $(\mathrm{BT}^{\Diamond B})^{\mathcal{J}(k,k)}$, the sender $S$ sends a message at time $k$ in $r$. It is easy to
show that $S$ does not send the bit after time $k$; we leave details to the reader. Thus, if $r$
is consistent with $(\mathrm{BT}^{\Diamond B})^{\mathcal{J}(k,k)}$ then $S$ sends the bit in $r$ at time $k$ (and does not send it
at any other time), so $r$ is of the form $r_{b,n}$.

We next claim that if $k \ne m$ then $P^1(k, m)$ does not de facto implement $\mathrm{BT}^{\Diamond B}$ in $\zeta$.
Without loss of generality, suppose that $k < m$. By the properties of $\gamma_1$, messages
can take up to five time units to be delivered. Hence, there is a run of $P^1(k, m)$ with
$\mathit{bit} = 1$ in which the sender's message is not delivered by time $m + 4$. However, because
$k < m$, there is no run with $\mathit{bit} = 0$ where no message is delivered by time $m + 4$.
Because $\sigma$ is deviation compatible, it follows that $K(m + 4, 1) = 0 < K(m + 4, 0)$. Thus,
$(\mathcal{J}(k, m), r'_1, m + 4) \models B_R(\mathit{bit} = 1)$, where $r'_1$ is the run with $\mathit{bit} = 1$ in which $S$ sends no
messages (the closest run to $r_{1,m+j}$ in which $S$ skips at time $m$), so
$(\mathcal{J}(k, m), r_{1,m+j}, m) \models B_S(\mathit{do}(S, \mathsf{skip}) > \Diamond B_R(\mathit{bit}))$ for $j = 1, \ldots, 5$. Therefore, $S$ should not send
the bit at time $m$ according to $(\mathrm{BT}^{\Diamond B})^{\mathcal{J}(k,m)}$ in runs where the bit is 1, showing that
$P^1(k, m)$ does not de facto implement $\mathrm{BT}^{\Diamond B}$.

To complete the proof of the lemma, we need to show that if $\kappa = \sigma(P^1(k, k))$ is
biased, then $P^1(k, k)$ does not implement $\mathrm{BT}^{\Diamond B}$ in $\zeta$. So suppose that $\kappa = \sigma(P^1(k, k))$
is biased. Since $\kappa$ is biased, there is an $n$ for which $K(n, 0) \ne K(n, 1)$. Without loss of
generality, assume that $K(n, 0) < K(n, 1)$. We must have $n > k$, since $K(\ell, 0) = K(\ell, 1) = 0$
for all $\ell \le k$, because in all runs consistent with $P^1(k, k)$, the receiver $R$ receives no
messages up to time $\ell$. It follows that $(\mathcal{J}(k, k), r, k) \models \Diamond B_R(\mathit{bit} = 0)$ for all runs $r$
consistent with $P^1(k, k)$. Thus, $(\mathcal{J}(k, k), r_{0,k+j}, k) \models B_S(\mathit{do}(S, \mathsf{skip}) > \Diamond B_R(\mathit{bit}))$ for
$j = 1, \ldots, 5$. It follows that, in runs where the bit is 0, $S$ should not send the bit according
to $(\mathrm{BT}^{\Diamond B})^{\mathcal{J}(k,k)}$. This again establishes that $P^1(k, k)$ does not de facto implement $\mathrm{BT}^{\Diamond B}$. ∎

Now consider the context $\gamma_2$. Here there is no upper bound on message delivery times.
As a result, $S$ must send $R$ messages regardless of what the bit value is.

Lemma 3.8 Every instance of $P^1(k, m)$ de facto implements both $\mathrm{BT}^{>}$ and $\mathrm{BT}^{\Diamond B}$ in
every context in $EC_2$.

Proof The proof for the case of $\mathrm{BT}^{>}$ is identical to the proof given for contexts in $EC_1$
in Lemma 3.5. There are now infinitely many runs $r_{b,n}$ consistent with $P^1(k, m)$ rather
than ten runs, but the argument remains sound. We leave details to the reader.

In the case of $\mathrm{BT}^{\Diamond B}$, the argument follows the same lines as the proof of Lemma 3.5,
except that the role of $\Diamond \mathit{recbit}$ is now played by $\Diamond B_R(\mathit{bit})$. Fix $k$, $m$, and a context $\zeta =
(\gamma_2, \pi, o, \sigma) \in EC_2$. We want to show that $P^1(k, m) \approx_{\gamma_2} (\mathrm{BT}^{\Diamond B})^{\mathcal{J}(k,m)}$, where $\mathcal{J}(k, m) =
(\mathcal{I}^+(\gamma_2, \pi), o(P^1(k, m)), \sigma(P^1(k, m)))$. It is easy to check that in the extended system
$\mathcal{J}(k, m)$, the formula $B_R(\mathit{bit} = b)$ holds in run $r_{b,n}$ from time $n$ on. Thus, $\Diamond B_R(\mathit{bit})$
holds at every point in every run consistent with $P^1(k, m)$ in the system $\mathcal{J}(k, m)$. Note
that the runs $r_{b,n}$ are precisely those of rank 0 in $\mathcal{J}(k, m)$. Finally, note that if $(r', n)$
is an arbitrary point in $\mathcal{J}(k, m)$ with $n > \max(k, m)$ and no messages are sent in $r'$ up
to time $n$, then $(\mathcal{J}(k, m), r', n) \models \neg B_R(\mathit{bit} = 0) \land \neg B_R(\mathit{bit} = 1)$, since there are runs
consistent with $P^1(k, m)$ where no messages arrive up to time $n$ and the bit can be either
0 or 1; for example, $(r_{0,n+1}, n) \sim_R (r', n)$ and $(r_{1,n+1}, n) \sim_R (r', n)$.

We now show that a run $r$ is consistent with $(\mathrm{BT}^{\Diamond B})^{\mathcal{J}(k,m)}$ in $\gamma_2$ iff $r = r_{b,n}$ for
$b \in \{0, 1\}$ and $n > 0$. So suppose that $r$ is consistent with $(\mathrm{BT}^{\Diamond B})^{\mathcal{J}(k,m)}$ and the
value of the bit in $r$ is 0. It suffices to show that $S$ sends exactly one message in $r$,
and that happens at time $k$. The argument is very similar to that in Lemma 3.6. If
$n < k$, then clearly $(\mathcal{J}(k, m), r, n) \models \mathit{do}(S, \mathsf{skip}) > \Diamond B_R(\mathit{bit})$, since the closest point
to $(r, n)$ where $\mathit{do}(S, \mathsf{skip})$ holds is $(r, n)$ itself. On the other hand, if $n = k$, then
$\mathsf{closest}([\![\mathit{do}(S, \mathsf{skip})]\!], (r, n), \mathcal{J}(k, m)) = \{(r'_0, n)\}$, where $r'_0$ is the run where $S$ sends
no messages and the initial bit is 0. As observed earlier, we have $(\mathcal{J}(k, m), r'_0, n) \models
\Box(\neg B_R(\mathit{bit} = 0) \land \neg B_R(\mathit{bit} = 1))$, so $(\mathcal{J}(k, m), r, n) \models \neg(\mathit{do}(S, \mathsf{skip}) > \Diamond B_R(\mathit{bit}))$.
Thus, since $r$ is consistent with $(\mathrm{BT}^{\Diamond B})^{\mathcal{J}(k,m)}$ in $\gamma_2$, $S$ sends its bit at time $k$ in $r$.
Finally, if $n > k$, again we have $\mathsf{closest}([\![\mathit{do}(S, \mathsf{skip})]\!], (r, n), \mathcal{J}(k, m)) = \{(r, n)\}$ so,
again, $S$ does not send a message at time $n$ in $r$. Thus, $r$ has the form $r_{0,n'}$ for some $n'$.
The same argument shows that all runs of the form $r_{0,n'}$ are in fact consistent with
$(\mathrm{BT}^{\Diamond B})^{\mathcal{J}(k,m)}$. The argument if $b = 1$ is identical (with $m$ replacing $k$ throughout). ∎

Finally, we consider the contexts in $EC_3$. In this case, communication is such that
if $R$ sends no messages, then $S$ is guaranteed to have one of its messages delivered only in
case it sends infinitely many messages. This says that if we consider only protocols of the
form $(P_S, \mathsf{SKIP}_R)$, then $S$ must send infinitely many messages in every run. However, if a
protocol sends infinitely many messages, then no particular one is necessary; if $S$ does not
send, say, the first message, then it still sends infinitely many, and $R$ is guaranteed to get
a message. This suggests that we will have difficulty finding a protocol that implements
$\mathrm{BT}^{>}$ or $\mathrm{BT}^{\Diamond B}$. The following proposition provides further evidence of this. If $I \subseteq \mathbb{N}$ (the
natural numbers), let $P(I) = (P_S(I), \mathsf{SKIP}_R)$, where $P_S(I)$ is described by the program

if time ∈ I then sendbit else skip.

Thus, with $P_S(I)$, the sender $S$ sends the bit at every time that appears in $I$.

Proposition 3.9 No protocol of the form $P(I)$ de facto implements either $\mathrm{BT}^{>}$ or $\mathrm{BT}^{\Diamond B}$
in any context in $EC_3$.

Proof We sketch the argument here and leave details to the reader. First suppose
that $I$ is finite. Let $r$ be a run of $P(I)$ where none of the finitely many messages sent
by $S$ is received. Let $n = \sup(I) + 1$. Suppose that $(\gamma_3, \pi, o, \sigma) \in EC_3$. Let $\mathcal{J}(I) =
(\mathcal{I}^+(\gamma_3, \pi), o(P(I)), \sigma(P(I)))$. Clearly, $\mathsf{closest}([\![\mathit{do}(S, \mathsf{skip})]\!], (r, n), \mathcal{J}(I)) = \{(r, n)\}$, since
$S$ performs the act $\mathsf{skip}$ at $(r, n)$. However, since $R$ never receives the bit in run $r$, and
$\sigma(P(I))(r) = 0$, it follows that $(\mathcal{J}(I), r, n) \models \neg \Diamond \mathit{recbit}$ and $(\mathcal{J}(I), r, n) \models \neg \Diamond B_R(\mathit{bit})$.
Thus, according to both $\mathrm{BT}^{>}$ and $\mathrm{BT}^{\Diamond B}$, $S$ should send a message at $(r, n)$. It follows
that $P(I)$ does not implement $\mathrm{BT}^{>}$ or $\mathrm{BT}^{\Diamond B}$.

Now suppose that $I$ is infinite. The properties of $\gamma_3$ ensure that $R$ does in fact receive
the bit in every run of $P(I)$. Moreover, it is easy to check that when the message is
received, both $\mathit{recbit}$ and $B_R(\mathit{bit})$ hold. Hence, for any given clock time $m \in I$, the
formulas $\mathit{do}(S, \mathsf{skip}) > \Diamond \mathit{recbit}$ and $\mathit{do}(S, \mathsf{skip}) > \Diamond B_R(\mathit{bit})$ hold at time $m$ in all runs of
the protocol. A straightforward argument shows that $\mathsf{sendbit}$ is compatible with neither
$\mathrm{BT}^{>}$ nor $\mathrm{BT}^{\Diamond B}$ at time $m$. ∎

Intuitively, Proposition 3.9 is a form of the "procrastinator's paradox": Any action 
that must be performed only eventually (e.g., washing the dishes) can always safely be 
postponed for one more day. Of course, using this argument inductively results in the 
action never being performed. 






Despite Proposition 3.9, we now show that $\mathrm{BT}^{>}$ and $\mathrm{BT}^{\Diamond B}$ are both implementable
in all contexts in $EC_3$. Let $P^{\omega} = (P^{\omega}_S, \mathsf{SKIP}_R)$, where $P^{\omega}_S$ is the protocol determined by
the following program:

if time = 0 or sendbit was performed in the previous round then sendbit else skip.

Since $S$'s local state contains both the current time and a record of the time at which
it sent every previous message, it can perform the test in $P^{\omega}_S$. It is not too hard to
see that $P^{\omega}$ is de facto equivalent to $P(\mathbb{N})$ in $\gamma_3$: under normal circumstances the bit
is sent in each and every round. The two protocols differ only in their counterfactual
behavior. As a result, while $P(\mathbb{N})$ implements neither $\mathrm{BT}^{>}$ nor $\mathrm{BT}^{\Diamond B}$, the protocol $P^{\omega}$
implements both.
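
The difference in counterfactual behavior can be seen in a short Python sketch that runs each sender rule for a finite number of steps and optionally forces a single deviation (a skip) at a chosen time. This is only an illustration of the two decision rules; it does not model message delivery, beliefs, or the closeness relation, and the names `p_all`, `p_omega`, and `run_with_forced_skip` are hypothetical.

```python
from typing import Callable, List, Optional

SENDBIT, SKIP = "sendbit", "skip"


def p_all(time: int, history: List[str]) -> str:
    """P(N): send the bit at every time, regardless of what happened before."""
    return SENDBIT


def p_omega(time: int, history: List[str]) -> str:
    """Send at time 0, and afterwards only if sendbit was performed in the previous round."""
    if time == 0 or (history and history[-1] == SENDBIT):
        return SENDBIT
    return SKIP


def run_with_forced_skip(protocol: Callable[[int, List[str]], str],
                         horizon: int,
                         skip_at: Optional[int] = None) -> List[str]:
    """Run the sender for `horizon` steps, forcing a skip at time `skip_at` if given."""
    history: List[str] = []
    for t in range(horizon):
        action = SKIP if t == skip_at else protocol(t, history)
        history.append(action)
    return history
```

Without a deviation both rules send in every step. After one forced skip, `p_all` resumes sending, so the skipped message was not needed, whereas `p_omega` never sends again, which is what makes each individual message necessary.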

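Lemma 3.10 $P^{\omega}$ de facto implements both $\mathrm{BT}^{>}$ and $\mathrm{BT}^{\Diamond B}$ in every context in $EC_3$.

Proof We provide the proof for $\mathrm{BT}^{\Diamond B}$. The proof for $\mathrm{BT}^{>}$ is similar, and left to the
reader.

Fix a context $\zeta_3 = (\gamma_3, \pi, o, \sigma) \in EC_3$. We want to show that $P^{\omega} \approx_{\gamma_3} (\mathrm{BT}^{\Diamond B})^{\mathcal{J}^{\omega}}$,
where $\mathcal{J}^{\omega} = (\mathcal{I}^+(\gamma_3, \pi), o(P^{\omega}), \sigma(P^{\omega}))$. Let $\mathbf{R}^{\omega} = \mathbf{R}(P^{\omega}, \gamma_3)$. Note that, for every natural
number $k$, there are runs $r_{b,k} \in \mathbf{R}^{\omega}$ in which $\mathit{bit} = b$ and no message that is sent by $S$ in
the first $k$ rounds is ever delivered to $R$. It follows that if $R$ has received no message by
time $m$ in run $r$ of $\mathbf{R}^{\omega}$, then $(\mathcal{J}^{\omega}, r, m) \models \neg B_R(\mathit{bit})$.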

We now prove by induction on $k$ that a run $r$ is consistent with $(\mathrm{BT}^{\Diamond B})^{\mathcal{J}^{\omega}}$ in $\gamma_3$ for $k$
rounds exactly if $S$ has performed $\mathsf{sendbit}$ in each of the first $k$ rounds of $r$. The base case
for $k = 0$ is vacuously true. For the inductive step, assume that the claim is true for $k = \ell$.
Suppose that $r$ is consistent with $(\mathrm{BT}^{\Diamond B})^{\mathcal{J}^{\omega}}$ for $\ell + 1$ rounds. By the induction hypothesis,
the sender $S$ has performed $\mathsf{sendbit}$ in each of the first $\ell$ rounds. Since $r$ is, by assumption,
consistent with $(\mathrm{BT}^{\Diamond B})^{\mathcal{J}^{\omega}}$ for $\ell + 1$ rounds, $S$ performs $\mathsf{sendbit}$ in round $\ell + 1$ of $r$ exactly
if $(\mathcal{J}^{\omega}, r, \ell) \models \neg B_S(\mathit{do}(S, \mathsf{skip}) > \Diamond B_R(\mathit{bit}))$. Let $\mathit{bit} = b$ in $r$. Moreover, $\sigma(P^{\omega})(r) = 0$
since $\sigma$ is deviation compatible. Clearly $(r, \ell) \sim_S (r_{b,\ell}, \ell)$, where $r_{b,\ell} \in \mathbf{R}^{\omega}$ is the run
constructed earlier where none of the messages sent by $S$ in the first $\ell$ rounds arrive, since
in both $r$ and $r_{b,\ell}$, the bit is the same and $S$ sends a message in each of the first $\ell$ rounds.
Moreover, $\sigma(P^{\omega})(r_{b,\ell}) = 0$, since $\sigma$ is deviation compatible and $r_{b,\ell} \in \mathbf{R}^{\omega}$. Thus, to show
that $(\mathcal{J}^{\omega}, r, \ell) \models \neg B_S(\mathit{do}(S, \mathsf{skip}) > \Diamond B_R(\mathit{bit}))$, it suffices to show that $(\mathcal{J}^{\omega}, r_{b,\ell}, \ell) \models
\neg(\mathit{do}(S, \mathsf{skip}) > \Diamond B_R(\mathit{bit}))$. The points in $\mathsf{closest}([\![\mathit{do}(S, \mathsf{skip})]\!], (r_{b,\ell}, \ell), \mathcal{J}^{\omega})$ have the form
$(r', \ell)$ where $r'$ agrees with $r_{b,\ell}$ up to and including time $\ell$, $S$ does nothing in round $\ell$ of $r'$,
and $S$ follows $P^{\omega}$ in all rounds after $\ell$ in $r'$. The key point here is that, by following $P^{\omega}$,
$S$ sends no messages in $r'$ after round $\ell$. Consequently, in all runs appearing in this set of
closest points, $S$ sends a finite number of messages (exactly $\ell$, in fact). By the admissibility
condition of $\gamma_3$, there is one run in this set, which we denote by $\hat{r}$, in which $R$ receives
no messages. Note that $(\hat{r}, n) \sim_R (r_{0,n}, n)$ and $(\hat{r}, n) \sim_R (r_{1,n}, n)$, since in all of $\hat{r}$, $r_{0,n}$
and $r_{1,n}$, the receiver $R$ receives no messages up to time $n$. Since both $r_{0,n}$ and $r_{1,n}$ are in
$\mathbf{R}^{\omega}$, it follows that they both have rank 0. Thus, $(\mathcal{J}^{\omega}, \hat{r}, n) \models \neg B_R(\mathit{bit})$. That is, $B_R(\mathit{bit})$
never holds in $\hat{r}$. It follows that $(\mathcal{J}^{\omega}, r_{b,\ell}, \ell) \models \neg(\mathit{do}(S, \mathsf{skip}) > \Diamond B_R(\mathit{bit}))$, as needed. We
can thus conclude that $r$ is consistent with $(\mathrm{BT}^{\Diamond B})^{\mathcal{J}^{\omega}}$ in $\gamma_3$ for $\ell + 1$ rounds exactly if $S$
performs $\mathsf{sendbit}$ in the first $\ell + 1$ rounds, and we are done. ∎

Lemma 3.10 shows one way of resolving the procrastinator's paradox: If one decides
that an action (e.g., washing the dishes) that is not performed now will never be
performed, then performing it becomes critical. (We are ignoring the issue of how one can
"decide" to use such a protocol. In the context of distributed computing, we can just make
this the protocol; people are likely not to believe that this is truly the protocol.) In any
case, using such a protocol makes performing the action consistent with the
procrastinator's protocol of doing no more than what is absolutely necessary.

We can summarize our analysis of implementability of $\mathrm{BT}^{>}$ and $\mathrm{BT}^{\Diamond B}$ by the following
theorem:

Theorem 3.11 Both $\mathrm{BT}^{>}$ and $\mathrm{BT}^{\Diamond B}$ are de facto implementable in every extended
context in $EC_1 \cup EC_2 \cup EC_3$. Moreover, if $P$ de facto implements $\mathrm{BT}^{>}$ or $\mathrm{BT}^{\Diamond B}$ in a
context $\zeta \in EC_1 \cup EC_2$, then $S$ sends at most one message in every run consistent with
$P$ in $\zeta$.

Proof The implementability claims follow from Lemmas 3.5, 3.6, 3.8, and 3.10. We now
prove that $S$ sends no more than one message in every run of a protocol that de facto
implements $\mathrm{BT}^{>}$ or $\mathrm{BT}^{\Diamond B}$ in a context in $EC_1 \cup EC_2$. Suppose that $P = (P_S, P_R)$ de
facto implements $\mathrm{BT}^{\Diamond B}$ in $\zeta = (\gamma, \pi, o, \sigma) \in EC_1 \cup EC_2$. Further suppose, by way of
contradiction, that there is a run $r$ consistent with $P$ in $\gamma$ in which the sender sends
more than one message. Suppose that the second message is sent at time $k$, and the
value of the bit in $r$ is $b$. Let $\mathcal{J} = (\mathcal{I}^+(\gamma, \pi), o(P), \sigma(P))$. Since $\gamma \in \{\gamma_1, \gamma_2\}$, all
messages are guaranteed to arrive eventually in the context $\gamma$. Thus, it is easy to see
that $(\mathcal{J}, r, k) \models B_S(\Diamond B_R(\mathit{bit} = b))$. It follows that $(\mathcal{J}, r, k) \models B_S(\mathit{do}(S, \mathsf{skip}) > \Diamond B_R(\mathit{bit}))$.
Since $P$ is de facto consistent with $\mathrm{BT}^{\Diamond B}$, this means that $S$ should not send a message at
$(r, k)$. This is a contradiction. ∎

All the contexts we have considered are synchronous; the sender and receiver know
the time. As we observed earlier, there is no analogue of $\gamma_1$ in the asynchronous setting,
since it does not make sense to say that messages arrive in 5 rounds. However, there are
obvious analogues of $\gamma_2$ and $\gamma_3$. Moreover, if we assume that $S$'s local state keeps track
of how many times it has been scheduled and what it did when it was scheduled, then
the analogue of $P^1(k, m)$ implements both $\mathrm{BT}^{>}$ and $\mathrm{BT}^{\Diamond B}$ if messages are guaranteed to
arrive (where now $P^1(k, m)$ means that if $\mathit{bit} = 0$, then the $k$th time that $S$ is scheduled
it performs $\mathsf{sendbit}$, while if $\mathit{bit} = 1$, then the $m$th time that $S$ is scheduled it performs
$\mathsf{sendbit}$). Similarly, the analogue of $P^{\omega}$ implements both $\mathrm{BT}^{>}$ and $\mathrm{BT}^{\Diamond B}$ in contexts that
satisfy the fairness assumption (but any finite number of messages may not arrive).

4 Discussion



This paper presents a framework that facilitates high-level counterfactual reasoning about
protocols. Indeed, it enables the design of well-defined protocols in which processes act
based on their knowledge of counterfactual statements. This is of interest because, in
many instances, the intuition behind the choice of a given course of action is best thought
of and described in terms of counterfactual reasoning. For example, it is sometimes most
efficient for agents to stop expending resources once they know that their goals will be
achieved even if they stop. Making this precise involves counterfactual reasoning; the
agent must consider what would happen were it to stop expending resources.

This paper should perhaps best be viewed as a "proof of concept"; the examples
involving the bit-transmission program show that counterfactuals can play a useful role in
knowledge-based programs. While we have used standard approaches to giving semantics
to belief and counterfactuals (adapted to the runs and systems framework that we are
using), these definitions give the user a large number of degrees of freedom, in terms
of choosing the ranking function to define belief and the notion of closeness needed
to define counterfactuals. While we have tried to suggest some reasonable choices for
how the ranking function and the notion of closeness are defined, and these choices
certainly gave answers that matched our intuitions in all the contexts we considered for
the bit-transmission problem, it would be helpful to have a few more examples to test the
reasonableness of the choices. We are currently exploring the application of cbb programs
for analyzing message-efficient leader election in various topologies; we hope to report on
this in future work.

While we used the very simple problem of bit transmission as a vehicle for introduc- 
ing our framework for knowledge, belief, and counterfactuals, we believe it should be 
useful for handling a much broader class of distributed protocols. We gave an example 
of how counterfactual reasoning is useful in deciding whether a message needs to be sent. 
Similar issues arise, for example, in deciding whether to perform a write action on a 
shared-memory variable. Because our framework provides a concrete model for under- 
standing the interaction between belief and counterfactuals, and for defining the notion 
of "closeness" needed for interpreting counterfactuals, it should also be useful for illu- 
minating some problems in philosophy and game theory. The insight our analysis gave 
to the procrastinator's paradox is an example of how counterfactual programs can be 
related to issues in the philosophy of human behavior. We believe that, in particular, the 
framework will be helpful in understanding some extensions of Nash equilibrium in game 
theory. For example, as we saw in Lemma 3.7, whether a protocol de facto implements
a cbb program depends on the agent's beliefs. This seems closely related to the notion
of a subjective equilibrium in game theory [Kalai and Lehrer 1995]. We are currently
working on drawing a formal connection between our framework and notions of equilibrium
in game theory. It would also be interesting to relate the notion of "closeness" defined in
our framework to that given by the structural-equations model used by Pearl [2000] (see
also [Halpern 2000]). The structural-equations model also gives a concrete interpretation
to "closeness"; it does so in terms of mechanisms defined by equations. It would be
interesting to see if these mechanisms can be modeled as protocols in a way that makes
the definitions agree.